---
title: "The State of Populism: Introducing the 2023 Wave of the Populism and Political Parties Expert Survey"
author: 
  - Andrej Zaslove, Radboud University, The Netherlands
  - Robert Huber, University of Salzburg, Austria
  - Maurits Meijers, University of Antwerp, Belgium (affiliated with Radboud University)

abstract: \noindent To understand the evolution of party-based populism, reliable and valid measurement is essential. This article presents the second wave of the Populism and Political Parties Expert Survey, capturing populism with a continuous, five-item multidimensional latent construct in 312 political parties across 31 European countries in 2023. We validate the approach through Exploratory Factor Analysis (EFA), Confirmatory Factor Analysis (CFA) and Item Response Theory (IRT). The IRT findings confirm the discriminatory power of the five items, while CFA results establish strict measurement invariance from 2018 to 2023, supporting cross-temporal comparisons. Substantively, we show that radical right parties remain the most populist, followed by the radical left, with both nativism and left-wing economic stances as robust predictors of populism. Overall, average populism levels declined from 2018 to 2023, largely due to decreases among parties with weaker nativist and authoritarian tendencies.

format:
  pdf:
    documentclass: article
    geometry: "left=1in, right=1.5in, top=1in, bottom=1in"
    link-citations: true
    linkcolor: purple
    bibliography:
      - references.bib
      - session_packages.bib
    fig-cap-location: top
    fig-pos: H
    linestretch: 1.0
header-includes:
  - \usepackage{booktabs}
  - \usepackage{array}
  - \usepackage{siunitx}
  - \sisetup{group-digits=false, input-symbols = ( ) +}
  - \usepackage{float}
  - \floatplacement{table}{H}  
  - \usepackage{endnotes}
  - \let\footnote=\endnote
  - \renewcommand{\enoteformat}{\setlength{\leftskip}{0pt}\makebox[0pt][r]{\theenmark.\enskip}}

---


```{r}
#| echo: false
#| message: false
#| warning: false

library(here)
library(tidyverse)
library(sjmisc)
library(vtable)
library(psych) 
library(GPArotation)
library(mirt)
library(lavaan)
library(semTable)
library(modelsummary)
library(marginaleffects)
library(kableExtra)
library(patchwork)
library(ggrepel)
library(xtable)
library(conflicted)

```


::: {.callout-note}

* This document runs the Quarto file used to produce the paper. 
Please note that the written text in this document may differ slightly from the published version, as the latter has undergone copy editing.  
* We provide the Quarto file to help readers understand the rationale behind the code in the context of the article.  
* The R code can also be run from the accompanying R script file, which contains the same code exported from this Quarto document. 

:::


::: {.latex}

\doublespacing

:::


# Introduction 

Populist political parties are central actors in most European countries. In the last decade, populist parties have appeared more frequently across the left-right political spectrum, and they are increasingly likely to participate in governing coalitions. Scholars have therefore increasingly scrutinized the behaviour of populist political parties, examining reasons for their success [@rooduijn2015], their larger impact on party competition [@wolinetz2018; @aichholzer2014], and their effects on (liberal) democracy [@mudde2012; @huber2017]. While much of this literature started qualitatively, studies of populist parties have undertaken a quantitative turn. A core requirement for solid quantitative studies is high-quality data, ideally over time.

To meet this requirement, this paper introduces a new dataset: a second wave of the *Populism and Political Parties Expert Survey* (POPPA 2023). The first round of the POPPA expert survey (POPPA 2018) demonstrated that it was possible to use expert surveys to provide a valid and reliable measure of populist parties [@meijers2021]. POPPA 2023 continues the work of POPPA 2018 for the year 2023, while also including new items that measure party characteristics regarding democratic commitment, climate change, issue salience, party change, party views on the elite and the people, and political compromise. The dataset includes all relevant political parties in 28 countries in 2018 and 31 countries in 2023.^[Fewer countries were included in 2018 than in 2023 due to the low number of responses in several countries. Due to an oversight, Democratic Rally (DISY) from Cyprus was not included in Wave 2 (2023). This oversight will be corrected in subsequent versions of the data.] 

POPPA 2023 follows the methodological approach of the first wave, building on the *multidimensional*, *continuous*, and *comprehensive* method for capturing populism. *Multidimensionality* refers to the use of five sub-dimensions of populism to produce an aggregate measure of populism. These sub-dimensions operationalize the ideational approach of populism. This approach differs from other expert surveys, which use a limited number of questions to measure populism. For example, the Chapel Hill Expert Survey (CHES) [@polk2017] uses a single anti-elitism question, while V-Party uses two questions, an anti-elitism and a people-centrism question. The Global Party Survey (GPS) uses two approaches: a single question and individual items to measure populism [@norris2020]. However, unlike the latter, POPPA 2023 specifically operationalizes the ideational approach [@mudde2004], treating populism as a latent variable that is measured by operationalizing each of the sub-dimensions of populism. 

The *continuous* nature of the approach refers to the use of a continuous scale which measures degrees of populism. Measuring populism along a continuous scale implies that political parties are more or less populist and that they cannot be classified as simply populist or not populist.  POPPA 2023 and POPPA 2018 differ in this regard from the approach put forward by the PopuList [@rooduijn2019], which uses a dichotomous measure of populism. Finally, the approach is *comprehensive*, providing a measure for every relevant party in parliament across European democracies. Whereas POPPA 2018 covered 249 parties in 28 countries, POPPA 2023 provides a measure of populism for 312 parties in 31 countries. The comprehensive nature of the data allows researchers to derive populism scores for all relevant parties in these European countries in 2018 and 2023, unlike other approaches which mainly code populist parties [such as the PopuList @rooduijn2019]. 

Expert surveys are a particularly valuable tool for assessing political parties’ policy positions and other traits or characteristics. This method allows researchers to deductively evaluate the *reputation* of political parties among experts, based on a predefined set of party features [@meijers2023a]. As a result, party positions and characteristics are captured at a high level of abstraction. The standardized nature of expert survey questions across countries and parties further enables meaningful cross-national and cross-party comparisons. In contrast, methods based on quantitative text analysis are more context-sensitive, reflecting the political and temporal circumstances in which the texts were produced [@hawkins2009; @licht2025]. While quantitative text analysis offers the advantage of processing large volumes of political texts—such as speeches, parliamentary debates, online posts, press releases, or manifestos—the content of these texts is highly dependent on their intended audience and context. For example, political actors tend to use more populist language in speeches than in party manifestos [@rooduijn2011; @hawkins2018]. Hence, expert surveys are especially well-suited for collecting broadly comparable data across a wide array of parties and contexts. At the same time, expert survey data is less suitable for research questions that specifically aim to understand the role of particular audiences and political settings in shaping politicians’ speech acts.

In this paper, we investigate the data quality of POPPA. We do so first by scrutinizing the multidimensionality of the populism measure. We use exploratory factor analysis (EFA), confirmatory factor analysis (CFA) and item response theory (IRT) to evaluate the multiple indicators used to measure the latent construct of populism. We then use invariance tests to evaluate the degree to which the measure performs equivalently across both waves. The paper then proceeds to illustrate the benefits of the continuous and the comprehensive approach. We use descriptive analysis of party families and OLS regression to highlight patterns between party families and regions and to focus on the predictors of populism, both contemporaneously and temporally. 

The paper finds that the POPPA populism measure loads on a single dimension, verified via an exploratory factor analysis (EFA) and a confirmatory factor analysis (CFA). Item response theory (IRT) demonstrates the coverage and the discriminatory nature of the five populism items. Importantly, the paper finds that the populism measure is invariant across the two waves, implying that changes in populism scores can be interpreted as real changes in degrees of populism, contemporaneously and temporally. 

The descriptive analysis demonstrates that the radical right party family is the most populist, followed by the radical left, while Christian democratic, Green and liberal parties are the least populist. Radical left, Christian democratic, social democratic, Green and, in particular, conservative and liberal parties have become less populist in Wave 2. We also find distinctive differences between the regions among party families. OLS regression analysis demonstrates the degree to which nativism and left-wing economic positions remain robust predictors of higher degrees of populism. Importantly, the OLS regression models also show that political parties in Wave 2 are less populist than political parties in Wave 1, while nativism, and to a lesser degree authoritarianism, are stronger predictors of populism in Wave 2 than in Wave 1. However, this effect is largely driven by the decline in levels of populism among parties that are less nativist and less authoritarian.

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false

# We import the data and eliminate the two populism variables. 
# We added these variables to the dataset based on the analysis here, i.e. after the fact.
# Since we create these variables in this paper, we need to remove them for the analysis.


df_integrated <- readRDS(here("data/poppa_integrated.rds")) %>%
  select(-c(populism_cfa,
            populism_cfa_rescaled))


# We make wave a factor. wave 1 is the reference category. 

df_integrated$wave <- forcats::as_factor(df_integrated$wave) 


# We make the party family variable a factor and Christian democracy the reference category. 

df_integrated$family_label <- forcats::as_factor(df_integrated$family_label)

df_integrated$family_label <- forcats::fct_relevel(df_integrated$family_label, "Christian Democratic")


df_integrated <- df_integrated %>%
  mutate(regions = forcats::fct_collapse(country,
                                         West = c("Austria",
                                                  "Belgium",
                                                  "Denmark",
                                                  "Finland",
                                                  "France",
                                                  "Germany",
                                                  "Iceland",
                                                  "Ireland",
                                                  "Luxembourg",
                                                  "Netherlands",
                                                  "Norway",
                                                  "Sweden",
                                                  "Switzerland",
                                                  "United Kingdom"),
                                         South = c("Cyprus",
                                                   "Greece",
                                                   "Italy",
                                                   "Malta",
                                                   "Portugal",
                                                   "Spain"),
                                         East  = c("Bulgaria",
                                                   "Croatia",
                                                   "Czech Republic",
                                                   "Estonia",
                                                   "Hungary",
                                                   "Latvia",
                                                   "Lithuania",
                                                   "Poland",
                                                   "Romania",
                                                   "Slovakia",
                                                   "Slovenia"))) 


# We reverse the lifestyle item for use below. 
# This recodes the lifestyle variable so that lower values are more liberal and higher values are more traditional.
# We store the result in a new variable with the suffix `_rev`. 

df_integrated <- df_integrated %>%
  mutate(lifestyle_rev = (10 - lifestyle)) 


# We create an authoritarian item by taking the row-wise mean of lifestyle_rev and laworder.
# Higher values indicate more authoritarian positions.

df_integrated <- df_integrated %>%
  dplyr::rowwise() %>%
  mutate(authoritarian = mean(c(lifestyle_rev, laworder))) %>%
  ungroup()


# We mean center nativism, authoritarian and lrecon for the interaction effects. 

df_integrated <- df_integrated %>% 
  mutate(nativism_centered = nativism - mean(nativism, na.rm = TRUE),
         authoritarian_centered = authoritarian - mean(authoritarian, na.rm = TRUE),
         lrecon_centered = lrecon - mean(lrecon, na.rm = TRUE) 
  )


# We create two vectors for graphing. It is easier to do now so we know where to find them if we move things around later.

populist_items <- c(
  "Anti-Elitism" = "antielitism",
  "People Centrism" = "peoplecentrism",
  "Manichean" = "manichean",
  "General Will" = "generalwill",
  "Indivisible" = "indivisible"
)


# We create a vector to order the items for later.

populism_items_order <- c("Anti-Elitism",
                          "People Centrism",
                          "General Will",
                          "Manichean",
                          "Indivisible") 



# We also create a custom colour palette for plots for later. 

custom_palette <- c(
  "#ffbb78",  # Light Orange (Light)
  "#7f7f7f",  # Gray (Dark)
  "#f7b6d2",  # Light Pink (Light)
  "#98df8a",  # Light Green (Light)
  "#c49c94",  # Light Brown (Light)
  "#17becf",  # Cyan (Medium)
  "#bcbd22",  # Yellow-Green (Medium)
  "#2ca02c",  # Green (Dark)
  "#1f77b4",  # Blue (Dark)
  "#e377c2",  # Pink (Light)
  "#ff7f0e",  # Orange (Medium)  "#d62728",  # Red (Dark)
  "#9467bd",  # Purple (Dark)
  "#008080"   # Dark Teal 
)


``` 

::: {.latex}

\doublespacing

:::

# The POPPA Approach: How to Measure Populism

POPPA 2023 includes items to operationalize the core features of the ideational approach [@mudde2004; @mudde2017; @muller2017]: the people-elite distinction, the Manichean notion of politics, and the idea that the people are homogeneous and indivisible. POPPA 2023 uses the same five items to measure populism that were employed in the 2018 survey (see @tbl-populist-items).^[The wording for the anti-elitism question is slightly different in 2023 in comparison with 2018. See the codebook for more details. POPPA 2018 also contained items that measured populism as a style and populism as a strategy. However, POPPA 2023 focuses exclusively on the ideational approach.]

The second wave of the Populism and Political Parties Expert Survey 2023 (POPPA 2023) was fielded between January 25, 2023 and May 24, 2023 in 31 countries.^[Originally, two separate surveys were fielded in Flanders and in Wallonia, but were subsequently merged under Belgium.] In total, 850 experts were contacted and 324 fully completed the survey, a response rate of 38 percent. POPPA 2023 is a continuation of POPPA 2018, repeating the core questions from POPPA 2018. @tbl-experts-per-country in the Online Appendix shows the mean, minimum and maximum number of experts per country. We use a base minimum of 4 experts per party per item. The mean for Wave 2 in 2023 is 8.87, with a maximum of 29 experts. The mean for Wave 1 in 2018 is 9.44, with a maximum of 18 experts. See the codebook for a closer breakdown of the number of experts per political party per item.
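
The minimum-expert rule can be illustrated with a short sketch. It assumes, for illustration only, that a party's score on an item is the mean of the individual expert ratings and that cells below the threshold are set to missing; the data frame `expert_ratings`, its column names and all values are hypothetical and do not come from the POPPA files.

```r
# Hypothetical expert-level ratings for a single item (0-10 scale); not POPPA data.
expert_ratings <- data.frame(
  party  = c(rep("Party A", 5), rep("Party B", 3), rep("Party C", 4)),
  rating = c(8, 9, 7, 8, 9,   2, 3, 1,   5, 6, 4, 5)
)

# Number of experts and mean rating per party.
n_experts   <- tapply(expert_ratings$rating, expert_ratings$party, length)
mean_rating <- tapply(expert_ratings$rating, expert_ratings$party, mean)

party_scores <- data.frame(
  party       = names(n_experts),
  n_experts   = as.vector(n_experts),
  mean_rating = as.vector(mean_rating)
)

# Assumed rule: a party-item score is only reported with at least 4 experts.
min_experts <- 4
party_scores$mean_rating[party_scores$n_experts < min_experts] <- NA

party_scores
```

In this toy example, Party B is rated by only three experts, so its score is set to missing, while Party A and Party C keep their mean ratings.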

::: {.latex}

\singlespacing

:::

::: {#tbl-populist-items}

```{=latex}

\begin{table}
\centering
\begin{tabular}{p{0.9\textwidth}} 
\toprule  
\midrule 
\addlinespace[10pt]  
\textbf{Anti-Elitism}: Some parties can be characterized by their attitudes toward the establishment and toward elites. This is often referred to as anti-elitism. \\
\vspace{-5pt}  % Reduce space between rows
\textbf{People-Centrism}: Some parties are very people-centred and believe that sovereignty should lie exclusively with the ordinary people (i.e. the ordinary people, not the elites, should have a final say in politics). \\
\vspace{-5pt}  
\textbf{General Will}: Some parties consider the ordinary people’s interests to be singular (i.e. one can speak of a general will). \\
\vspace{-5pt}  
\textbf{Manichean}: Some parties see politics as a moral struggle between good and bad. This is often described as a Manichean worldview. \\
\vspace{-5pt}  
\textbf{Indivisible}: Some parties consider the ordinary people to be indivisible (i.e. the people are seen as homogenous). \\
\addlinespace[10pt]  
\midrule  
\bottomrule  
\end{tabular}
\end{table}

```

Populist Items

:::

::: {.latex}

\doublespacing

:::

POPPA uses the thin-centered ideology definition of populism [@mudde2004; @mudde2017]. The thin-centered nature of populism implies that populism does not cover a comprehensive worldview over a broad range of policy domains. As such, populism is usually combined with an attaching ideology. Populism differs in this regard from 'full' ideologies, such as liberalism and socialism. The thin-centered nature of populism implies that populist parties exist across the left-right political spectrum. In order to operationalize these attaching dimensions, POPPA 2023 also contains a series of expert survey items that measure left-right overall placement, left-right economic positions, positions regarding immigration, nativism, law and order, lifestyle^[Lifestyle measures degrees of moral conservatism vis-à-vis more liberal positions.] and party positions regarding European integration. 

In addition, POPPA 2023 includes new items, measuring democratic commitment, climate change, issue salience, party change, party views on the elite and the people, and political compromise. These items are integrated into the POPPA 2023 dataset to address pressing research questions. The POPPA 2023 dataset is fully integrated with POPPA 2018, providing a measure for populism and attaching dimensions over two waves.^[The integrated dataset also includes several variables from the 2018 dataset that were not repeated in the newer version. These include measures for complexity, emotional appeals, intra-party democracy, and personalized leadership.] 

@tbl-summary-statistics shows the summary statistics from Wave 2 and from Wave 1. The last column in the table shows the p-values for the t-tests, comparing items between the two waves. Summary statistics for the combined waves can be found in @tbl-summary-statistics-full-appendix in the Online Appendix. Based on the directly measured five sub-dimensions of populism, POPPA 2023 includes a populism variable derived from the mean as well as from the regression scores of CFA analysis (both standardized and rescaled), as we discuss below. 
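
The two forms of the composite measure can be sketched as follows. The example uses base R only and invented numbers: `items` is a hypothetical set of party-level item scores, and `factor_score` merely stands in for CFA regression scores, which are not estimated here. The helper `rescale_0_10()` is an illustrative name for a min-max transformation to the 0 to 10 range.

```r
# Hypothetical party-level item scores (0-10 scale); not POPPA data.
items <- data.frame(
  antielitism    = c(9.1, 2.3, 5.0),
  peoplecentrism = c(8.7, 1.9, 4.6),
  generalwill    = c(8.2, 2.8, 5.2),
  manichean      = c(8.9, 2.0, 4.8),
  indivisible    = c(7.6, 2.5, 4.4)
)

# Composite 1: unweighted mean of the five items for each party.
populism_mean <- rowMeans(items)

# Composite 2 (form only): min-max rescaling of a standardized score to 0-10.
# `factor_score` is a made-up placeholder for CFA regression scores.
factor_score <- c(1.4, -1.1, -0.1)

rescale_0_10 <- function(x) {
  (x - min(x, na.rm = TRUE)) / diff(range(x, na.rm = TRUE)) * 10
}

populism_rescaled <- rescale_0_10(factor_score)

data.frame(populism_mean, factor_score, populism_rescaled)
```

Note that a min-max rescaling is relative to the observed sample: the least populist party in the data receives 0 and the most populist party receives 10, so rescaled values should be read as positions within the set of parties covered.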

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Descriptive Statistics for Wave 2 and Wave 1
#| label: tbl-summary-statistics


# We create a descriptive table for both waves.
# we use the vtable package.
# We create a new dataframe for the vtable to be used for the descriptive table. 
# We do this just to keep things separate since we do some special manipulations for the table.


df_integrated_table <- df_integrated


# We run the CFA for the descriptive table. 
# The explanation and the full process for creating the CFA populism variable can be found below.


cfa_cov_3_table <- '

populism_cfa_cov_3_table =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
peoplecentrism ~~ antielitism
manichean ~~  antielitism

'

# Fit the model to the data.

fit_cov_3_table <- cfa(model = cfa_cov_3_table, data = df_integrated_table)


# We create a function to attach the regression scores to the dataframe.
# We do this since we attach factor scores several times and we want to be sure that there is no overlap or confusion between dataframes.


attach_factor_scores <- function(df, fit_model) {
  # Step 1: Extract case indices
  idx <- lavInspect(fit_model, "case.idx")
  
  # Step 2: Extract factor scores
  fscores <- lavPredict(fit_model, method = "regression")
  
  # Step 3: Loop over each factor score and attach to the dataframe
  for (fs in colnames(fscores)) {
    # Attach the factor scores to the dataframe using the case indices
    df[idx, fs] <- fscores[, fs]
  }
  
  # Return the modified dataframe
  return(df)
}

# We attach the regression scores to the dataset with the function we just created.

df_integrated_table <- attach_factor_scores(df_integrated_table, fit_cov_3_table)


# We rescale the CFA populism variable.

df_integrated_table$populism_cfa_resc_table <- scales::rescale(df_integrated_table$populism_cfa_cov_3_table, to = c(0, 10))


# We create a column for the t-test.

# We initialize an empty vector to store p-values.

p_values <- c()

# We list the variables to perform the t-tests on.

variables <- c("populism_mean", "populism_cfa_resc_table", "populism_cfa_cov_3_table", "antielitism", "generalwill", "manichean", "indivisible", 
               "peoplecentrism", "lroverall", "lrecon", "immigration", "nativism", "authoritarian",
               "laworder", "lifestyle", "eu")

# We perform the t-tests and store the p-values.

for (var in variables) {
  wave1_data <- df_integrated_table %>% dplyr::filter(wave == "Wave 1 - 2018") %>% pull(var)
  wave2_data <- df_integrated_table %>% dplyr::filter(wave == "Wave 2 - 2023") %>% pull(var)
  
  t_test_result <- t.test(wave1_data, wave2_data, var.equal = FALSE) # This performs Welch's t-test.
  p_values <- c(p_values, t_test_result$p.value)
}

# We add the p-values and variables to the dataframe.

p_values_df <- data.frame(Variable = variables, P_Value = round(p_values, 2))


# We create the descriptive table with vtable.
# We first rename the variables. 

labs_waves_shorter <- data.frame(populism_mean = 'Populism (Mean)',
                                 populism_cfa_resc_table = 'Populism (CFA: Rescaled)',
                                 populism_cfa_cov_3_table = 'Populism (CFA: Standardized)',
                                 antielitism = 'Anti-Elitism',
                                 generalwill = 'General Will',
                                 manichean   = 'Manichean',
                                 indivisible = 'Indivisible',
                                 peoplecentrism = 'People Centrism',
                                 lroverall = 'Left-Right Overall',
                                 lrecon = 'Left-Right Economy',
                                 immigration = 'Immigration',
                                 nativism = 'Nativism',
                                 authoritarian = 'Authoritarian',
                                 laworder = 'Law and Order',
                                 lifestyle = 'Lifestyle',
                                 eu = 'European Integration',
                                 wave = 'Wave')


# We create the descriptive table with vtable.

table_waves_shorter <- df_integrated_table %>%
  select(populism_mean,
         populism_cfa_resc_table,
         populism_cfa_cov_3_table,
         antielitism,
         generalwill, 
         manichean,  
         indivisible, 
         peoplecentrism, 
         lroverall, 
         lrecon,
         immigration,
         nativism, 
         authoritarian,
         laworder,
         lifestyle,
         eu,
         wave) %>%
  mutate(wave = forcats::fct_rev(wave) ) %>%
  vtable::sumtable(labels = labs_waves_shorter,
                   group = "wave",
                   out = 'return')

# We do some manipulation to join the p-value column to the dataset.

# We insert an empty value at the top of the p_values_df to align with the header row.
p_values_with_header <- c(NA, p_values_df$P_Value)

# We convert it back to a dataframe for consistency.
p_values_with_header_df <- data.frame(P_Value = p_values_with_header)

# We append the p-values column to the table, which now includes a placeholder for the header row.
table_waves_shorter <- cbind(table_waves_shorter, p_values_with_header_df)

# We replace NA in the P_Value column with an empty string and rename the column. 
table_waves_shorter$P_Value[is.na(table_waves_shorter$P_Value)] <- ""

# We rename the P_Value column.
colnames(table_waves_shorter)[colnames(table_waves_shorter) == "P_Value"] <- "T-Test"

# We remove the first row from the data.
table_waves_shorter <- table_waves_shorter[-1, ]


table_waves_shorter %>%
  kable(
    format = "latex",  
    booktabs = TRUE,
    linesep = "",
    row.names = FALSE
  ) %>%
  kable_classic() %>% 
  kable_styling(
    full_width = FALSE, 
    position = "center", 
    bootstrap_options = c("striped", "hover"),
  ) %>%
  column_spec(2, width = "8em") %>%
  column_spec(5, width = "8em") %>%
  add_header_above(c(" " = 1, "Wave 2 (2023)" = 3, "Wave 1 (2018)" = 3, " " = 1)) %>%
  row_spec(0, bold = TRUE) %>%
  footnote(general = "Welch two-sided t-test",
           general_title = "Note:", 
           footnote_as_chunk = TRUE)

```

::: {.latex}

\doublespacing

:::

The summary statistics in @tbl-summary-statistics show several interesting patterns.^[This paper was written and the analyses were conducted using Quarto. Analyses were conducted using R (version 4.5.1) and RStudio (2025.5.0.496). 
The following R packages were used for statistical analysis, table generation, and visualization: R [@R-base], conflicted [@R-conflicted], dplyr [@R-dplyr], forcats [@R-forcats], ggplot2 [@R-ggplot2; @ggplot22016], ggrepel [@R-ggrepel], GPArotation [@R-GPArotation; @GPArotation2005], here [@R-here], kableExtra [@R-kableExtra], lattice [@R-lattice; @lattice2008], lavaan [@R-lavaan; @lavaan2012], lubridate [@R-lubridate; @lubridate2011], marginaleffects [@R-marginaleffects; @marginaleffects2024], mirt [@R-mirt; @mirt2012], modelsummary [@R-modelsummary; @modelsummary2022], patchwork [@R-patchwork], psych [@R-psych], purrr [@R-purrr], readr [@R-readr], semTable [@R-semTable], sjmisc [@R-sjmisc; @sjmisc2018], stringr [@R-stringr], tibble [@R-tibble], tidyr [@R-tidyr], tidyverse [@R-tidyverse; @tidyverse2019], vtable [@R-vtable], and xtable [@R-xtable].] First, we see that, on average, a picture of ideological stability emerges, i.e., the means and the standard deviations are very similar between Wave 1 and Wave 2 (see also @whitefield2007). In some cases, however, the mean values show statistically significant differences between the waves. These differences are related to populism and immigration. All three populism composite measures (the mean and the mean of the regression scores, standardized and rescaled) demonstrate lower populism scores for Wave 2, with p-values equal to or less than 0.05.^[The mean of the regression scores represents the difference between the overall mean from both waves and the mean from the individual waves.] Further analysis of the latent mean of the populism variable also demonstrates that parties in Wave 2 are, on average, less populist than in Wave 1 (see @tbl-latent-mean).^[In order to estimate the change in the latent mean across waves, we run a multigroup confirmatory factor analysis. We constrain the factor loadings and the item intercepts to be equal across the waves, i.e. a scalar model.
The latent means of the first group (Wave 1) are fixed to zero. The latent means of the second group (Wave 2) are estimated relative to the baseline of the first group (Wave 1). This allows us to assess the degree of change between the latent construct in Wave 1 and Wave 2. This provides the mean for the latent factor of the specific wave. The analysis shows that there is a 0.342 standard deviation decline in the latent mean of populism in Wave 2, with a p-value of 0.025, indicating a small to moderate effect size.] The items measuring the general will, the Manichean worldview, indivisibility and people-centrism also exhibit lower scores, with p-values of less than 0.05. Immigration shows a mean value 0.5 points higher in Wave 2, with a p-value of 0.05, implying that political parties have become slightly more favorable to immigration. We also see a slight increase in the level of nativism, but the difference is not statistically significant.

::: {.latex}

\singlespacing

:::

\vspace{10pt}

```{r}
#| echo: false
#| output: asis
#| label: tbl-latent-mean
#| tbl-cap: Latent Mean for Populism CFA

# We create a table for the latent means.
# To avoid colliding variables with Lavaan we create a new dataset for the latent means.

df_integrated_latent_means <- df_integrated


# We run the same model for the CFA.

cfa_cov_3_means <- '

populism_cfa_cov_3_means =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
peoplecentrism ~~ antielitism
manichean ~~  antielitism

'

# We run a  model to obtain the latent means.

fit_cov_3_means <- cfa(model = cfa_cov_3_means, data = df_integrated_latent_means,
                            group = "wave",
                            group.equal = c("loadings", "intercepts"))    



# We extract parameter estimates including latent means.

estimates <- parameterEstimates(fit_cov_3_means)


# We create a table with the latent means. 

# We prepare the data as a dataframe.
estimates_table <- estimates %>%
  dplyr::filter(lhs == "populism_cfa_cov_3_means",
                op == "~1",      
                group == 2) %>%  # group 2 is Wave 2 (estimated)
  select(
    Variable = lhs,
    Est = est,
    SE = se,
    `p-value` = pvalue
  ) %>%
  mutate(
    Variable = "Difference in Populism (Wave 2 - Wave 1)",
    Est = round(Est, 3),
    SE = round(SE, 3),
    `p-value` = round(`p-value`, 3)
  )

# We create the table with kableExtra.

estimates_table %>%
  kable(
    format = "latex", 
    booktabs = TRUE, 
    col.names = c("Variable", "Estimate", "Std. Error", "p-value"),
    escape = FALSE
  ) %>%
  kable_classic() %>% 
  kable_styling(
    full_width = FALSE, 
    position = "center",
  )

```

\vspace{10pt}

::: {.latex}

\doublespacing

:::

This initial examination of the data already points to some interesting patterns. First, there is considerable consistency between the two waves. As noted, the mean values and the standard deviations are comparable. However, there is also a small but statistically significant decline in populism between the waves and a marginal increase in support for immigration. We delve further into these changes below. 

# Populism as a Multidimensional and Latent Construct

The multidimensional nature of populism requires the use of multiple indicators to capture its sub-dimensions. In this section we ascertain whether these five items capture the latent construct of populism in both waves. First, we examine whether the five items form a single factor, namely populism. To this end, we use an exploratory factor analysis (EFA). This helps us to understand whether the five items load consistently on a single factor and whether the structure of the factor changes between the waves. Second, we use a confirmatory factor analysis (CFA) to investigate how well the data fits the underlying theoretical concept. We subsequently employ an invariance test to examine systematically whether the structure of the factor holds across waves. Finally, item response theory (IRT) models allow us to investigate how informative the individual items are for the overall factor. 

## Exploratory Factor Analysis (EFA)

We start by scrutinizing whether the five items form one internally consistent factor. An EFA investigation usually starts by determining the number of potential factors. We anticipate a one-dimensional scale. @tbl-factor-analysis shows the results from the EFA for three models: for the full dataset as well as Wave 1 and Wave 2 separately. The results indicate that the data fits a single factor model. The individual factor loadings for the three models indicate, however, slight differences between the waves. The factor loadings for anti-elitism and people-centrism are higher in Wave 2, while the factor loadings for general will and indivisible are slightly higher in Wave 1. 

Communality and uniqueness scores also suggest that our data fits the one-factor solution well. Communality scores capture the amount of variance of the observed variable that is explained by the factor, while uniqueness scores show how much of the variance is unique, that is, not explained by the factor. In other words, they capture how well the factor explains the individual items. As such, it is desirable to obtain high communality scores and low uniqueness scores. Despite the generally good fit of the data for all three models, @tbl-factor-analysis shows that communality scores are slightly lower for Wave 1. This is especially the case for anti-elitism and to a lesser degree people-centrism. Uniqueness scores are also a touch higher for the same items in Wave 1. Nonetheless, we conclude that the data, for the full model and for the individual waves, fits a single factor well, despite these smaller discrepancies. To further tease out the relationship between the individual indicators and the populism construct we conduct a confirmatory factor analysis in the next section.  
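
As a minimal illustration of these two quantities, assume standardized items and a single factor, with $\lambda_i$ denoting the loading of item $i$ (our notation, introduced here only for exposition). In that case the two scores are complements of one another:

$$
h_i^2 = \lambda_i^2, \qquad u_i^2 = 1 - h_i^2 .
$$

A purely hypothetical loading of $\lambda_i = 0.9$ would thus imply a communality of $0.81$ and a uniqueness of $0.19$; this value is illustrative and is not taken from @tbl-factor-analysis.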

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false
#| label: tbl-factor-analysis
#| tbl-cap: Exploratory Factor Analysis

# We run an EFA with maximum likelihood estimation.
# Since we only have one dimension we cannot rotate.

factor_full_dataset <- df_integrated %>%
  select(all_of(populist_items)) %>%
  rename_with(~ names(populist_items), everything())%>%
  psych::fa(nfactors = 1, SMC = TRUE, fm = 'ml')

factor_wave1_dataset <- df_integrated %>%
  dplyr::filter(wave == "Wave 1 - 2018") %>%
  select(all_of(populist_items)) %>%
  rename_with(~ names(populist_items), everything())%>%
  psych::fa(nfactors = 1, SMC = TRUE, fm = 'ml')

factor_wave2_dataset <- df_integrated %>%
  dplyr::filter(wave == "Wave 2 - 2023") %>%
  select(all_of(populist_items)) %>%
  rename_with(~ names(populist_items), everything())%>%
  psych::fa(nfactors = 1, SMC = TRUE, fm = 'ml')


# We create a table to show the results.

# We create a function to prepare loadings for the data frame.

prepare_loadings_df <- function(factor_result) {
  loadings <- as.data.frame(factor_result$loadings[])
  loadings$Communality <- factor_result$communality
  loadings$Uniqueness <- factor_result$uniquenesses
  loadings$Complexity <- factor_result$complexity
  loadings <- tibble::rownames_to_column(loadings, "Variable")
  loadings <- loadings %>% mutate(across(where(is.numeric), ~round(.x, 3)))
  loadings <- rename(loadings, `Factor 1` = `ML1`)
  return(loadings)
}

# We prepare the loadings for the dataframes.

loadings_full <- prepare_loadings_df(factor_full_dataset)
loadings_wave1 <- prepare_loadings_df(factor_wave1_dataset)
loadings_wave2 <- prepare_loadings_df(factor_wave2_dataset)



# We combine the results into a single table.

# We create the headings for each section.

heading_full <- data.frame(
  Variable = "Full Dataset", 
  `Factor 1` = NA, 
  Communality = NA, 
  Uniqueness = NA, 
  Complexity = NA
)

heading_wave1 <- data.frame(
  Variable = "Wave 1", 
  `Factor 1` = NA, 
  Communality = NA, 
  Uniqueness = NA, 
  Complexity = NA
)

heading_wave2 <- data.frame(
  Variable = "Wave 2", 
  `Factor 1` = NA, 
  Communality = NA, 
  Uniqueness = NA, 
  Complexity = NA
)

# We combine the results into a single table.

efa_table <- bind_rows(
  heading_full, loadings_full,
  heading_wave1, loadings_wave1,
  heading_wave2, loadings_wave2
) %>%
  select(-Factor.1) %>%
  relocate(`Factor 1`, .after = Variable) %>%
  mutate(across(everything(), ~ifelse(is.na(.), "", .)))

efa_table %>%
  kable(format = "latex", booktabs = TRUE, escape = FALSE, linesep = "") %>%
  kable_classic %>% 
  kable_styling(full_width = FALSE, font_size = 8) %>%
  row_spec(which(efa_table$Variable %in% c("Full Dataset", "Wave 1", "Wave 2")), bold = TRUE) %>%
  column_spec(1, width = "12em") %>%  
  column_spec(2:5, width = "6em")    

```

::: {.latex}

\doublespacing

:::

## Confirmatory Factor Analysis (CFA)
The CFA provides a rigorous test of the latent structure by explicitly fitting a theoretical model and offering model fit statistics. We run CFA models on the full dataset (i.e., both waves). To examine consistency over time, we also conduct a multigroup CFA, which allows us to compare similarities and differences across waves.

Conducting a CFA is an iterative process. We start with a theoretically informed baseline model where all five items load on a latent populism concept, with no additional specifications. The initial model does not provide a good fit (CFI = 0.834, RMSEA = 0.461, SRMR = 0.048).^[Fit statistics reported in the tables are rounded to two decimal places.] Most indicators, except the SRMR, fail to meet the thresholds for an acceptable fit (CFI ≥ 0.95, RMSEA ≤ 0.05, SRMR ≤ 0.05; see @hu1999). Multigroup fit results are similar (see @tbl-cfa-full_unstd, @tbl-cfa-multi-group-cov-1, @tbl-cfa-multi-group-cov-2, and @tbl-cfa-multi-group-cov-3 in the Online Appendix).

To improve the model fit, we examine the modification indices and take theoretical considerations into account.^[The process is outlined in more detail in @tbl-modification-indices-basic, @tbl-modification-indices-cov-1, @tbl-modification-indices-cov-2, and @tbl-modification-indices-cov-3 in the Online Appendix.] The modification indices suggest that adding a residual covariance between *general will* and *indivisible people* could improve the model fit. This adjustment yields better results (CFI = 0.963, RMSEA = 0.242, SRMR = 0.037), but the RMSEA remains high, indicating that further adjustments are needed. We then add a residual covariance between *manicheanism* and *people-centrism* based on high modification indices, though this residual covariance reveals issues of negative correlation and multicollinearity. As such, we instead add a residual covariance between *people-centrism* and *anti-elitism*, based on the modification indices and on the idea that both are core aspects of populism [@mudde2004]. This model further improves the fit (CFI = 0.983, SRMR = 0.012), but the RMSEA (0.188) remains suboptimal. 

Finally, we move to the last model: adding a third residual covariance between *manicheanism* and *anti-elitism* results in an excellent fit (CFI = 0.999, RMSEA = 0.045, RMSEA p-value = 0.453, SRMR = 0.004), with a non-significant chi-squared test, indicating minimal deviation between observed and predicted covariances. The multigroup model produces similar results (see @tbl-cfa3 and @tbl-cfa-mg-3).^[Tables with full results from the CFA models, including the multigroup models, can be found in the Online Appendix. In the text we provide unstandardized estimates. Standardized estimates for all CFA models, with the exception of the multigroup models, can be found in the Online Appendix.]
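
The sequence of models can be summarized in a brief sketch; the notation is ours and is not part of the estimation output. Each item $x_i$ ($i = 1, \dots, 5$) is modeled as

$$
x_i = \tau_i + \lambda_i \xi + \varepsilon_i, \qquad \Sigma = \phi\, \lambda \lambda^{\top} + \Theta ,
$$

where $\xi$ is latent populism with variance $\phi$, $\Sigma$ is the model-implied covariance matrix of the items, and $\Theta$ is the residual covariance matrix (the intercepts $\tau_i$ do not enter the covariance structure). The baseline model assumes a diagonal $\Theta$; the final model frees three off-diagonal elements (general will with indivisible, people-centrism with anti-elitism, manichean with anti-elitism). Assuming the common identification in which one loading is fixed to one, the degrees of freedom of the baseline model follow from counting:

$$
df = \frac{5 \times 6}{2} - (4 + 5 + 1) = 15 - 10 = 5 ,
$$

with four free loadings, five residual variances and one latent variance. Each added residual covariance uses one degree of freedom, leaving 4, 3 and 2 for the successive models.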

In summary, the CFA models demonstrate that the five items representing the sub-dimensions of populism effectively capture the latent construct of populism, with particularly robust performance when residual covariances between items are included. The multigroup models also demonstrate a good fit across different waves, supporting consistency over time. To validate this further, we conduct an invariance test to assess the stability of the populist structure across waves.

::: {.latex}

\singlespacing

:::

\vspace{10pt}

```{r}
#| echo: false
#| eval: true
#| output: false

# We run the full CFA models for all four models (baseline model; 1 residual covariance; 2 residual covariances; 3 residual covariances).
# We also produce the fit for the models and we produce the tables for the models. 
# We only use the models with 3 residual covariances in the main document for the analyses.
# We render the other models in the tables in the Appendix. 
# We also show the modification indices in the Appendix with a discussion of the proofs. 


# --------------

# We fit the CFA for the populist items. 

# 1. This is the Baseline Model. 

# Define the model

cfa_basic_full <- '

populism_cfa_basic =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

'

# Fit the model to the data

fit_basic_full <- cfa(model = cfa_basic_full, data = df_integrated)


# Summarize the results

# summary(fit_basic_full, fit.measures = TRUE, standardized = TRUE)

# modificationIndices(fit_basic_full, sort = TRUE)

# The estimation ends normally and the degrees of freedom are 5.
# But the `fit_basic_full` model does not produce a very good fit.


# 2. This model will allow the residuals of 'generalwill' and 'indivisible' to covary.
# The modification index (MI) from the baseline model indicates a high value for the residual covariance of 'generalwill' and 'indivisible', suggesting that adding this parameter would significantly improve the model fit.
# Theoretically, it is reasonable to assume that experts might have interpreted these two items in a similar way, which would affect the residual/error terms of these variables. Therefore, specifying this residual covariance helps account for shared variance due to similar interpretations.


# Define the model with residual covariances

cfa_cov_1_full <- '

  populism_cfa_cov_1 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible
  
  generalwill ~~ indivisible
  
'


# Fit the model to the data

fit_cov_1_full <- cfa(model = cfa_cov_1_full, data = df_integrated)


# Summarize the results

# summary(fit_cov_1_full, fit.measures = TRUE, standardized = TRUE)

# modificationIndices(fit_cov_1_full, sort = TRUE)


# The model ends normally and there are 4 degrees of freedom.
# The model `cfa_cov_1_full` produces a better fit but we can still improve on the fit. 


# 3. This model allows the residuals of 'generalwill' and 'indivisible' to covary and, in addition, the residuals of 'manichean' and 'peoplecentrism'. 
# The modification index (MI) from the model with 1 residual covariance indicates a high value for the residual covariance between 'manichean' and 'peoplecentrism', suggesting that adding this parameter would significantly improve the model fit. 
# There are also theoretical reasons why these two items might share residuals/error terms.
# As we will see, however, there appears to be an issue of negative correlation between these items. 

cfa_cov_2_full_manichean <- '

populism_cfa_cov_2 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
manichean ~~ peoplecentrism

'


# Fit the model to the data

fit_cov_2_full_manichean <- cfa(model = cfa_cov_2_full_manichean, data = df_integrated)


# Summarize the results

# summary(fit_cov_2_full_manichean, fit.measures = TRUE, standardized = TRUE)

# modificationIndices(fit_cov_2_full_manichean, sort = TRUE)

# There are issues with this model. We see a negative residual covariance for `manichean ~~ peoplecentrism`.
# If we run the modification indices we see that there is an issue of correlation between `manichean` and `peoplecentrism`.


# 4. The previous model covaried `manichean ~~ peoplecentrism`. This produced a negative residual covariance. Subsequent analysis with this route showed problems.
# We therefore run a new model which allows the residuals of 'generalwill' and 'indivisible' to covary and, in addition, the residuals of 'peoplecentrism' and 'antielitism'. 
# Since the previous model, which included the residual covariance `manichean ~~ peoplecentrism`, produced issues, we have opted for this second route. 
# The modification index (MI) from the model with 1 residual covariance indicates a high value for the residual covariance between 'peoplecentrism' and 'antielitism', suggesting that adding this parameter would significantly improve the model fit. 
# There are also theoretical reasons why these two items might share residuals/error terms. 


cfa_cov_2_full <- '

populism_cfa_cov_2 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
peoplecentrism ~~ antielitism

'


# Fit the model to the data

fit_cov_2_full <- cfa(model = cfa_cov_2_full, data = df_integrated)


# Summarize the results

# summary(fit_cov_2_full, fit.measures = TRUE, standardized = TRUE)

# modificationIndices(fit_cov_2_full, sort = TRUE)

# The model ended normally with 3 degrees of freedom. 
# The model fit is good but we can still improve the model. 

# 5. We run the modification indices again. We see that two sets of items produce high values: `manichean ~~ antielitism` and `manichean ~~ peoplecentrism`. We choose to covary `manichean ~~ antielitism` due to the earlier issues with `manichean ~~ peoplecentrism`.
# This model allows the residuals of 'generalwill' and 'indivisible' to covary, as well as those of 'peoplecentrism' and 'antielitism' (i.e. as in the previous model), and in addition it now covaries 'manichean' and 'antielitism'. 
# There are also theoretical reasons why these two items might share residuals/error terms. 


cfa_cov_3_full <- '

populism_cfa_cov_3 =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
peoplecentrism ~~ antielitism
manichean ~~  antielitism

'

# Fit the model to the data

fit_cov_3_full <- cfa(model = cfa_cov_3_full, data = df_integrated)


# Summarize the results

# summary(fit_cov_3_full, fit.measures = TRUE, standardized = TRUE)

# modificationIndices(fit_cov_3_full, sort = TRUE)


# The model ends normally with 2 degrees of freedom. 
# The model produces a very good fit. 
# The modification indices show no more high values. 


# --------- Multigroup models -----------------

# We run each model as a multigroup model. 

fit_mgm_basic <- cfa(cfa_basic_full, data = df_integrated, group = "wave")

# summary(fit_mgm_basic, fit.measures = TRUE, standardized = TRUE)

fit_mgm_cov_1 <- cfa(cfa_cov_1_full, data = df_integrated, group = "wave")

# summary(fit_mgm_cov_1, fit.measures = TRUE, standardized = TRUE)

fit_mgm_cov_2 <- cfa(cfa_cov_2_full, data = df_integrated, group = "wave")

# summary(fit_mgm_cov_2, fit.measures = TRUE, standardized = TRUE)

fit_mgm_cov_3 <- cfa(cfa_cov_3_full, data = df_integrated, group = "wave")

# summary(fit_mgm_cov_3, fit.measures = TRUE, standardized = TRUE)



# We create tables for the CFA models. 
# We only show the models with 3 residual covariances in the main text. The rest we show in the Appendix. 

# We create tables for the CFA. 

vlabs_five_basic <- c("populism_cfa_basic" = "Populism" , "manichean" = "Manichean" , "generalwill" = "General Will" , "peoplecentrism" = "People Centrism" , "antielitism" = "Anti-Elitism" , "indivisible" = "Indivisible")

vlabs_five_cov_1 <- c("populism_cfa_cov_1" = "Populism" , "manichean" = "Manichean" , "generalwill" = "General Will" , "peoplecentrism" = "People Centrism" , "antielitism" = "Anti-Elitism" , "indivisible" = "Indivisible")

vlabs_five_cov_2 <- c("populism_cfa_cov_2" = "Populism" , "manichean" = "Manichean" , "generalwill" = "General Will" , "peoplecentrism" = "People Centrism" , "antielitism" = "Anti-Elitism" , "indivisible" = "Indivisible")

vlabs_five_cov_3 <- c("populism_cfa_cov_3" = "Populism" , "manichean" = "Manichean" , "generalwill" = "General Will" , "peoplecentrism" = "People Centrism" , "antielitism" = "Anti-Elitism" , "indivisible" = "Indivisible")


semTable(fit_basic_full, type = "latex",
         columns = c("est", "se", "p"),
         columnLabels = c(est = "Estimate",
                          se = "Std. Err.",
                          p = "p-value"),
         varLabels = vlabs_five_basic,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_basic_full.tex") 

 semTable(fit_cov_1_full, type = "latex",
         columns = c("est", "se", "p"),
         columnLabels = c(est = "Estimate",
                          se = "Std. Err.",
                          p = "p-value"),
         varLabels = vlabs_five_cov_1,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_cov_1_full.tex") 

  semTable(fit_cov_2_full, type = "latex",
         columns = c("est", "se", "p"),
         columnLabels = c(est = "Estimate",
                          se = "Std. Err.",
                          p = "p-value"),
         varLabels = vlabs_five_cov_2,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_cov_2_full.tex") 
  
  
  semTable(fit_cov_3_full, type = "latex",
         columns = c("est", "se", "p"),
         columnLabels = c(est = "Estimate",
                          se = "Std. Err.",
                          p = "p-value"),
         paramSets = c("loadings", "residualcovariances"),
         varLabels = vlabs_five_cov_3,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_cov_3_full_paper.tex") 
  
  
  semTable(fit_cov_3_full, type = "latex",
         columns = c("est", "se", "p"),
         columnLabels = c(est = "Estimate",
                          se = "Std. Err.",
                          p = "p-value"),
         varLabels = vlabs_five_cov_3,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_cov_3_full_appendix.tex") 
  
  

# We create tables for the multigroup. 
# We only show the models with 3 residual covariances in the main text. The rest we show in the Appendix. 
# We use `estsestars` for the multigroup to save space in the table.
  

semTable(fit_mgm_basic, type = "latex",
         columns = c("estsestars"),
         paramSets = c("loadings", "intercepts", "residualvariances", "latentvariances"),
         varLabels = vlabs_five_basic,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_mgm_basic.tex") 

semTable(fit_mgm_cov_1, type = "latex",
         columns = c("estsestars"),
         paramSets = c("loadings", "intercepts", "residualcovariances", "residualvariances", "latentvariances"),
         varLabels = vlabs_five_cov_1,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_mgm_cov_1.tex") 

semTable(fit_mgm_cov_2, type = "latex",
         columns = c("estsestars"),
         paramSets = c("loadings", "intercepts", "residualcovariances", "residualvariances", "latentvariances"),
         varLabels = vlabs_five_cov_2,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_mgm_cov_2.tex") 


semTable(fit_mgm_cov_3, type = "latex",
         columns = c("estsestars"),
         paramSets = c("loadings", "residualcovariances", "latentvariances"),
         varLabels = vlabs_five_cov_3,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_mgm_cov_3_paper.tex")


semTable(fit_mgm_cov_3, type = "latex",
         columns = c("estsestars"),
         paramSets = c("loadings", "intercepts", "residualcovariances", "residualvariances", "latentvariances"),
         varLabels = vlabs_five_cov_3,
         fits = c("cfi", "chisq", "rmsea", "rmsea.pvalue", "srmr"), 
         file = "cfa_tables/fit_mgm_cov_3_appendix.tex") 


```

```{r}
#| echo: false
#| output: asis
#| tbl-cap: CFA with Three Residual Covariances (Unstandardized Estimates)
#| label: tbl-cfa3

# The table for the CFA model with three residual covariances. 
         
cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_cov_3_full_paper.tex}
\\hspace*{.25cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')

```

\vspace{10pt}

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Multigroup CFA with Three Residual Covariances (Unstandardized Estimates)
#| label: tbl-cfa-mg-3


# The table for the multigroup model with three residual covariances.

cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_mgm_cov_3_paper.tex}
\\hspace*{-1.5cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')

```

::: {.latex}

\doublespacing

:::
  
## Invariance Test 

The summary statistics table (@tbl-summary-statistics) and analysis of the latent mean (@tbl-latent-mean) indicate a decline in populism between the 2018 and 2023 waves. However, before interpreting these changes as meaningful, we must test whether experts in both waves interpreted the populism construct similarly, based on the five core items. We conduct an invariance test on our CFA models to assess four levels of invariance: configural, metric, scalar, and strict. Configural invariance confirms that factor loadings follow the same pattern across waves. Metric invariance means factor loadings are equivalent, scalar invariance ensures item intercepts are equal, and strict invariance requires that residual variances match across waves.
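
The four levels can be read as a nested sequence of equality constraints. As a sketch in our own notation (not part of the original output), let item $i$ in wave $g$ follow

$$
x_{ig} = \tau_{ig} + \lambda_{ig}\, \xi_g + \varepsilon_{ig}, \qquad \operatorname{Var}(\varepsilon_{ig}) = \theta_{ig}, \qquad g \in \{2018, 2023\} .
$$

The levels then add restrictions step by step:

$$
\begin{aligned}
\text{configural:} &\quad \text{same pattern of loadings, all parameters free across waves} \\
\text{metric:} &\quad \lambda_{i,2018} = \lambda_{i,2023} \\
\text{scalar:} &\quad \text{metric, and } \tau_{i,2018} = \tau_{i,2023} \\
\text{strict:} &\quad \text{scalar, and } \theta_{i,2018} = \theta_{i,2023}
\end{aligned}
$$

Each step is compared with the preceding one through a chi-squared difference test, $\Delta\chi^2 = \chi^2_{\text{restricted}} - \chi^2_{\text{less restricted}}$, evaluated on the corresponding difference in degrees of freedom. A non-significant $\Delta\chi^2$ suggests that the added constraints do not significantly worsen the fit, so that the stricter level of invariance can be retained.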

Our invariance test begins with the baseline model (without residual covariances) and proceeds to the final model with three residual covariances. The results for all models are shown in @tbl-invariance-all-models in the Online Appendix. The baseline model achieves scalar invariance, while the models with one, two, and three residual covariances show evidence for strict invariance. This provides strong evidence that the populism measure is consistent across waves, even for the baseline model. Since all models show configural invariance, the factor structure, including the number of factors and the pattern of loadings, remains consistent between waves. All models also demonstrate metric invariance, confirming that factor loadings are equivalent across waves. Additionally, scalar invariance across models suggests that the item intercepts are equal across waves. The models which include residual covariances show evidence for strict invariance, indicating that the residual variances match across the two waves. As such, we conclude that changes in levels of populism between waves likely reflect real changes in populism. 

In sum, the combination of the CFA models and the invariance tests validates the latent construct of the multidimensional measure. The model with three residual covariances produces an excellent fit and the populism measure is invariant across the two waves. These findings show the additional benefits of the multidimensional and continuous nature of our measurement. Only by constructing a latent variable, i.e., through the use of the sub-dimensions of populism, is it possible to test the extent to which the populist construct is invariant. To our knowledge, POPPA is the only existing populism scale that achieves strict invariance.

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false

# We run the invariance tests now but we only use them in the Appendix. 
# We do this to avoid variables colliding with CFA's when we create the populism CFA variable from the regression scores. 

# We run the models for the Baseline Model.
configural_basic <- cfa(cfa_basic_full, data = df_integrated, group = "wave")
metric_basic <- cfa(cfa_basic_full, data = df_integrated, group = "wave", group.equal = "loadings")
scalar_basic <- cfa(cfa_basic_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts"))
strict_basic <- cfa(cfa_basic_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts", "residuals"))

lavTestLRT_results_basic <- lavTestLRT(configural_basic, metric_basic, scalar_basic, strict_basic)
lavTestLRT_results_basic_df <- as.data.frame(lavTestLRT_results_basic)
lavTestLRT_results_basic_df <- lavTestLRT_results_basic_df %>% 
  tibble::rownames_to_column("model_raw") %>%
  mutate(Model = case_when(
    model_raw == "configural_basic" ~ "Configural",
    model_raw == "metric_basic"     ~ "Metric",
    model_raw == "scalar_basic"     ~ "Scalar",
    model_raw == "strict_basic"     ~ "Strict",
    .default                        = model_raw
  )) %>%
  select(Model, everything(), -model_raw)

# We run the models for the one residual covariance model.
configural_cov_1 <- cfa(cfa_cov_1_full, data = df_integrated, group = "wave")
metric_cov_1 <- cfa(cfa_cov_1_full, data = df_integrated, group = "wave", group.equal = "loadings")
scalar_cov_1 <- cfa(cfa_cov_1_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts"))
strict_cov_1 <- cfa(cfa_cov_1_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts", "residuals"))

lavTestLRT_results_cov_1 <- lavTestLRT(configural_cov_1, metric_cov_1, scalar_cov_1, strict_cov_1)
lavTestLRT_results_cov_1_df <- as.data.frame(lavTestLRT_results_cov_1)
lavTestLRT_results_cov_1_df <- lavTestLRT_results_cov_1_df %>% 
  tibble::rownames_to_column("model_raw") %>%
  mutate(Model = case_when(
    model_raw == "configural_cov_1" ~ "Configural",
    model_raw == "metric_cov_1"     ~ "Metric",
    model_raw == "scalar_cov_1"     ~ "Scalar",
    model_raw == "strict_cov_1"     ~ "Strict",
    .default                        = model_raw
  )) %>%
  select(Model, everything(), -model_raw)


# We run the models for the two residual covariance model.
configural_cov_2 <- cfa(cfa_cov_2_full, data = df_integrated, group = "wave")
metric_cov_2 <- cfa(cfa_cov_2_full, data = df_integrated, group = "wave", group.equal = "loadings")
scalar_cov_2 <- cfa(cfa_cov_2_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts"))
strict_cov_2 <- cfa(cfa_cov_2_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts", "residuals"))

lavTestLRT_results_cov_2 <- lavTestLRT(configural_cov_2, metric_cov_2, scalar_cov_2, strict_cov_2)
lavTestLRT_results_cov_2_df <- as.data.frame(lavTestLRT_results_cov_2)
lavTestLRT_results_cov_2_df <- lavTestLRT_results_cov_2_df %>% 
  tibble::rownames_to_column("model_raw") %>%
  mutate(Model = case_when(
    model_raw == "configural_cov_2" ~ "Configural",
    model_raw == "metric_cov_2"     ~ "Metric",
    model_raw == "scalar_cov_2"     ~ "Scalar",
    model_raw == "strict_cov_2"     ~ "Strict",
    .default                        = model_raw
  )) %>%
  select(Model, everything(), -model_raw)


# We run the models for the three residual covariance model.
configural_cov_3 <- cfa(cfa_cov_3_full, data = df_integrated, group = "wave")
metric_cov_3 <- cfa(cfa_cov_3_full, data = df_integrated, group = "wave", group.equal = "loadings")
scalar_cov_3 <- cfa(cfa_cov_3_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts"))
strict_cov_3 <- cfa(cfa_cov_3_full, data = df_integrated, group = "wave", group.equal = c("loadings", "intercepts", "residuals"))

lavTestLRT_results_cov_3 <- lavTestLRT(configural_cov_3, metric_cov_3, scalar_cov_3, strict_cov_3)
lavTestLRT_results_cov_3_df <- as.data.frame(lavTestLRT_results_cov_3)
lavTestLRT_results_cov_3_df <- lavTestLRT_results_cov_3_df %>% 
  tibble::rownames_to_column("model_raw") %>%
  mutate(Model = case_when(
    model_raw == "configural_cov_3" ~ "Configural",
    model_raw == "metric_cov_3"     ~ "Metric",
    model_raw == "scalar_cov_3"     ~ "Scalar",
    model_raw == "strict_cov_3"     ~ "Strict",
    .default                        = model_raw
  )) %>%
  select(Model, everything(), -model_raw)


# This is for the table with all of the models. 

# We add a column to indicate the section.
lavTestLRT_results_basic_df <- lavTestLRT_results_basic_df %>% mutate(Section = "Model: Baseline")
lavTestLRT_results_cov_1_df <- lavTestLRT_results_cov_1_df %>% mutate(Section = "Model: Res. Cov. 1")
lavTestLRT_results_cov_2_df <- lavTestLRT_results_cov_2_df %>% mutate(Section = "Model: Res. Cov. 2")
lavTestLRT_results_cov_3_df <- lavTestLRT_results_cov_3_df %>% mutate(Section = "Model: Res. Cov. 3") 


# We create a table of the invariance test results. 
# We do this here to keep everything together.
# We use this in the Appendix.


# We combine all dataframes.
combined_lavTestLRT_results <- bind_rows(
  lavTestLRT_results_basic_df,
  lavTestLRT_results_cov_1_df,
  lavTestLRT_results_cov_2_df,
  lavTestLRT_results_cov_3_df
)

# We ensure correct ordering of columns.
combined_lavTestLRT_results <- combined_lavTestLRT_results %>%
  rename(`Model Group` = Section) %>% 
  select(`Model Group`, Model, Df, AIC, BIC, Chisq, `Chisq diff`, `Df diff`, `Pr(>Chisq)`)


```

::: {.latex}

\doublespacing

:::

## Item Response Theory 
Our analyses thus far confirm the scale’s high quality, yet CFA offers limited insight into the discriminatory power of the five sub-dimensions of populism or the extent to which each item contributes to the overall latent structure. To address this, we apply item response theory (IRT), using a graded response model appropriate for ordered data. Since graded response models require discrete, ordered responses, we round the average expert judgments for each populist item to the nearest integer [@deayala2013; @samejima1969].

The IRT models yield several important insights.^[Following standard practice, we set Theta values between -3 and 3. The actual IRT values for these data range between -1.9708 and 2.5232.] The Item Information plot (see Plot 1 in @fig-irt-model-combined) shows each item’s contribution to the latent populist construct. The sharp peaks for *anti-elitism* and *people-centrism*, and to a lesser extent *general will*, suggest these items effectively discriminate among parties with different levels of populism. Meanwhile, the *manichean* and *indivisible* items, with lower peaks, provide broader curves, capturing information across lower populism levels. This is a common observation in populist attitudes research [@castanhosilva2020].
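
For readers less familiar with the graded response model, its conventional form can be sketched as follows; the symbols are ours, and estimation software may report an equivalent slope-intercept parameterization instead. For item $i$ with ordered categories $k = 0, \dots, m_i$, discrimination $a_i$ and thresholds $b_{ik}$, the probability of a response in category $k$ or higher is

$$
P^{*}_{ik}(\theta) = \frac{1}{1 + \exp\left[-a_i\,(\theta - b_{ik})\right]}, \qquad P^{*}_{i0}(\theta) = 1, \quad P^{*}_{i,m_i+1}(\theta) = 0 ,
$$

so that the probability of a response in exactly category $k$, and the resulting item information, are

$$
P_{ik}(\theta) = P^{*}_{ik}(\theta) - P^{*}_{i,k+1}(\theta), \qquad I_i(\theta) = \sum_{k=0}^{m_i} \frac{\left[P'_{ik}(\theta)\right]^2}{P_{ik}(\theta)} .
$$

A larger $a_i$ produces steeper category curves and hence a taller, narrower information curve, which is why sharply peaked items are read as more discriminating.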

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false
#| eval: true
#| include: false

# We run IRT models for the whole dataset. 

df_irt_five_full <- df_integrated %>%
  select(manichean,                
         generalwill,       
         peoplecentrism,    
         antielitism,
         indivisible)

df_irt_five_full <- na.omit(df_irt_five_full)

# We round the values to the nearest integer.  

df_irt_five_full <- df_irt_five_full %>%
  mutate(across(everything(), round)) 

# We fit the model using the GRM for polytomous (ordered) items.

model_graded <- 'Populism = 1-5'

fit_5_full <- mirt(data = df_irt_five_full, model = model_graded, itemtype = "graded")


```

```{r}
#| echo: false
#| fig-width: 12
#| fig-height: 10
#| label: fig-irt-model-combined

# We create a function for the Item Information plots. 
# Since we render the plot a few times with different versions of the dataset, the function avoids duplicating code. 

# Function to extract information data.

extract_info_data <- function(fit_model, theta_values, item_names) {
  info_results <- data.frame()
  for (i in 1:extract.mirt(fit_model, 'nitems')) {
    item <- extract.item(fit_model, i)
    info_values <- iteminfo(item, Theta = theta_values)
    info_df <- data.frame(Theta = theta_values, Information = info_values, Item = item_names[i])
    info_results <- bind_rows(info_results, info_df)
  }
  return(info_results)
}

# We create a function for the Item Probability Function. 
# Since we render the plot a few times with different versions of the dataset, the function avoids duplicating code. 

# Function to extract probability data.

extract_prob_data <- function(fit_model, theta_values, item_names) {
  results <- data.frame()
  
  for (i in 1:extract.mirt(fit_model, 'nitems')) {
    item <- extract.item(fit_model, i)
    probabilities <- probtrace(item, Theta = theta_values)
    item_df <- as.data.frame(probabilities)
    item_df$Theta <- theta_values
    item_df$Item <- item_names[i]
    
# We reshape the data to long format.
    
    item_df <- item_df %>%
      pivot_longer(cols = starts_with("P"), 
                   names_to = "Category", 
                   values_to = "Probability")
    
# We convert the Category to a factor with specified levels.
    
    item_df$Category <- factor(item_df$Category, levels = paste0("P.", 1:11))
    
    results <- bind_rows(results, item_df)
  }
  
  return(results)
}


# We render full dataset for the Item Information plots.
# We define the sequence of theta values and item names for the full dataset.
# These will be reused for both the Item Information and Item Probability calculations.

theta_values_full <- seq(-3, 3, by = 0.1)
item_names_full <- colnames(extract.mirt(fit_5_full, 'data'))

info_results_full <- extract_info_data(fit_5_full, theta_values_full, item_names_full)

# We plot the information data.

plot_item_inf_full <- info_results_full %>%
  mutate(Item = forcats::fct_recode(Item, !!!populist_items), 
         Item = forcats::fct_relevel(Item, !!!populism_items_order)) %>%
  ggplot(aes(x = Theta, y = Information, color = Item)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ Item) +
  theme_bw() +
  labs(title = "Item Information (Full Dataset)",
       x = expression(theta),
       y = "Information") +  
  theme(plot.title = element_text(hjust = 0.5), 
        legend.position = "bottom",
        legend.direction = "horizontal",
        legend.box = "horizontal") +  
  guides(color = guide_legend(title = "Items", nrow = 1), 
         linetype = guide_legend(title = "Items", nrow = 1))


# We render full dataset for Item Probability Function. 
# We reuse the theta_values_full and item_names_full variables defined above.

results_full <- extract_prob_data(fit_5_full, theta_values_full, item_names_full)


# We plot the probability data.

# We create a vector so that the colours are the same between plots for the levels. 

category_levels <- c("P.1", "P.2", "P.3",  "P.4",  "P.5",  "P.6",  "P.7",  "P.8", "P.9",  "P.10", "P.11")


# We named the vector for the colors.

category_colors <- setNames(custom_palette, category_levels)


lty_vector <- c(rep("solid", 7), rep("dashed", 4))

plot_ipf_full <- results_full %>%
  mutate(Item = forcats::fct_recode(Item, !!!populist_items), 
         Item = forcats::fct_relevel(Item, !!!populism_items_order)) %>%
  ggplot(aes(x = Theta, y = Probability, color = Category, linetype = Category)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ Item, scales = "free_y") +
  theme_bw() +
  labs(title = "Item Probability Function (Full Dataset)",
       x = expression(theta),
       y = expression(P(theta)),
       color = "Category") +
  scale_color_manual(values = category_colors) +
  scale_linetype_manual(values = lty_vector) +  
  theme(plot.title = element_text(hjust = 0.5), 
        legend.position = "bottom",
        legend.direction = "horizontal",
        legend.box = "horizontal") +  
  # The legend shows the response categories (P.1 to P.11), not the items.
  guides(color = guide_legend(title = "Category", nrow = 1), 
         linetype = guide_legend(title = "Category", nrow = 1))

# We combine the plots.

plot_item_inf_full / plot_ipf_full +
  plot_annotation(tag_levels = '1')

```

::: {.latex}

\doublespacing

:::

\vspace{10pt}

Plot 2 from @fig-irt-model-combined presents the Item Probability Function for each sub-dimension. Some data loss is observed due to rounding, particularly at the highest scale levels, shown by the dashed lines (values 8–11).^[Each dimension is measured on an 11-point scale from 0 to 10. The mirt package used to run these models scales the plots from 1 to 11.] The Item Probability Function reveals the levels at which parties are likely to endorse the latent trait and indicates how effectively each item differentiates between levels of the corresponding dimension. As expected, higher levels (8–10, equivalent to 7–9 on the original scale) contribute significantly to higher scores on the five sub-dimensions. Medium and lower levels function as anticipated, demonstrating that the sub-dimensions capture information well across the entire scale range.

In summary, the multidimensional and latent nature of populism necessitates using multiple indicators to measure populism accurately. Factor analyses (EFA and CFA) confirm the contributions of each dimension to the latent construct, while invariance tests validate the measure’s consistency across waves. Finally, item response theory demonstrates the discriminatory character of the five sub-dimensions and highlights each sub-dimension’s contribution to the populism latent structure.

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false

# Create the populism variable from the factor scores. 

# We first make the populism variable from the regression scores from the CFA. 
# We attach the regression scores from the model with three residual covariances from the full dataset.
# We use the function from above. 

df_integrated <- attach_factor_scores(df_integrated, fit_cov_3_full)

# We rescale the variable

df_integrated$populism_cfa_resc <- scales::rescale(df_integrated$populism_cfa_cov_3, to = c(0, 10))

```

::: {.latex}

\doublespacing

:::

# Populism: Trends and Transformations

An advantage of the continuous and comprehensive approach is that it allows us to assess the degree of populism across all relevant parties, i.e. all parties that have seats in parliament. In the following section, we explore how populist ideation manifests itself across different parties and across the two waves of the expert survey.^[In the analyses below, we use the regression scores from the CFA model with three residual covariances. We rescale the CFA regression scores to range from 0 to 10. As such, caution is needed when comparing the rescaled CFA regression scores with the mean of the five items. Rescaling the CFA regression scores stretches the scale, slightly inflating, or in some cases deflating, the values. The advantage of the CFA regression scores is, however, that they account for measurement error, as indicated by the better fit achieved when using residual covariances. Moreover, the CFA regression scores assign different weights to the individual items: items with higher factor loadings are given more weight. In the Online Appendix, we also run the analysis with the populism variable created from the mean of the five items. See @tbl-summary-statistics-full-appendix for descriptives of the populism CFA variable and for comparisons with the mean populism variable.]
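
The rescaling step can be sketched in base R as a min-max transformation. The helper `rescale_0_10` and the factor scores below are hypothetical and serve only to show why rescaled scores should not be read on the same footing as the item means.

```r
# Illustrative sketch only: min-max rescaling of hypothetical factor scores.
rescale_0_10 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1]) * 10
}

factor_scores <- c(-1.8, -0.6, 0.1, 0.9, 2.3)  # made-up values
rescaled <- rescale_0_10(factor_scores)
round(rescaled, 2)
```

The lowest observed score is mapped to 0 and the highest to 10, whatever the underlying item values are. The ordering of parties is preserved, but the distances to the endpoints of the original 0 to 10 item scale are not.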

## Party Families and Populism

@fig-violin-plots uses violin plots to show the distribution of populism scores by party family, with parties coloured by region. The violin plots show the density of the data points, with the coloured dots representing political parties within the party families and the red diamond representing the median. A wider plot indicates increased density and more concentrated values around the middle point of the data, while a more elongated violin plot indicates a broader distribution of the data points.

@fig-violin-plots demonstrates that the radical right party family is the most populist of all party families. The median level of populism is the highest for the radical right, and the values are also the most concentrated for this party family. Several parties, however, distinguish themselves from the rest. This is especially the case at the lower end of the scale, indicating that some parties that are considered radical right are less populist. Turning to the regional grouping of the parties, we see that the radical right parties that tend to score lower on the populism scale can be found in Central and Eastern Europe (CEE).

The radical left also has a higher median of populism than the other party families. However, in the case of the radical left there is more diversity among the parties regarding populism. The parties are less concentrated on the populism spectrum, since more parties score both considerably higher and lower along the populism scale. The radical left parties that score lower on the populism scale tend to be located in Southern Europe. 

The continuous measure also allows for the assessment of the unusual suspects. @fig-violin-plots shows that the Christian democratic, green and liberal party families have a lower median of populism. Again, the plot also highlights that there are outlying values at both ends of the scale. Parties in CEE countries, once again, tend to deviate from the norm. A handful of Christian democratic, social democratic and liberal parties in CEE countries score higher on the populism scale than other parties in their respective party families.

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false
#| warning: false
#| fig-height: 6
#| fig-width: 6
#| fig-cap: "Degrees of Populism per Party Family"
#| label: fig-violin-plots


# We create a violin plot for populism per party family. 

# We create a vector to reorder the party families. 

family_reorder <- c("Radical Right",
                    "Radical Left",
                    "Christian Democratic",
                    "Social Democratic",
                    "Conservatives",
                    "Green",
                    "Liberal",              
                    "Regionalist",
                    "Confessional",
                    "Agrarian",             
                    "No family") 



# This is the violin plot.

local({
  set.seed(1234)  # We use a seed so that the jitter and label positions are the same every time we render the plot. Note that set.seed() resets the global RNG state, even inside local().

df_integrated %>%
  mutate(family_label = forcats::fct_rev(forcats::fct_relevel(family_label, family_reorder))) %>% 
  dplyr::filter(family_label != "Confessional") %>%
  group_by(family_label) %>%
  mutate(
    lower_bound = quantile(populism_cfa_resc, 0.10, na.rm = TRUE),  
    upper_bound = quantile(populism_cfa_resc, 0.90, na.rm = TRUE)   
  ) %>%
  ungroup() %>%
  ggplot(aes(x = family_label, y = populism_cfa_resc, colour = regions)) + #fill = family_label, 
  geom_violin(trim = TRUE, color = "black") +
  geom_jitter(shape = 16, position = position_jitter(0.2)) +
  geom_text_repel(aes(label = ifelse(populism_cfa_resc < lower_bound | populism_cfa_resc > upper_bound, 
                                     party_short, 
                                     NA),
                      color = regions),  # Color text by region
                  size = 2, 
                  max.overlaps = 30,
                  show.legend = FALSE) +  
  
  stat_summary(fun = median, geom = "point", shape = 23, size = 3, fill = "red", color = "black") +
  coord_flip() +
  scale_y_continuous(limits = c(0, 10.5)) + 
  theme_bw() +
  labs(x = "",
       y = "",
       colour = "Regions") +
  guides(
    fill = "none"
  ) +
  theme(
    legend.background=element_blank(),
    plot.tag = element_text(size = 16),
    plot.title = element_text(hjust = 0.5),
    axis.title.y = element_text(size = 16),
    axis.title.x = element_text(size = 16),
    legend.position = "inside",
    legend.position.inside = c(0.1, 0.88))

})

```

::: {.latex}

\doublespacing

:::

Given the temporal nature of the data, we also use descriptive analysis to examine change over time. @fig-party-families-boxplots uses box plots to compare changes in the degree of populism between the waves among party families. We only include parties that were included in both waves. Higher values indicate higher populism scores for Wave 2 versus Wave 1. The dashed line indicates no change, while the solid line in the box indicates the median. We once again colour the parties by region.
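
The change score can be sketched in base R as follows. The data frame `toy` and its column names are hypothetical. The sketch only shows the logic of restricting the data to parties observed in both waves and subtracting Wave 1 from Wave 2.

```r
# Illustrative sketch only: Wave 2 minus Wave 1 populism per party, made-up data.
toy <- data.frame(
  party_id = c("A", "A", "B", "B", "C"),
  wave     = c(1, 2, 2, 1, 1),
  populism = c(6.0, 5.2, 3.5, 3.0, 8.1)
)

# Keep only parties observed in both waves (party C drops out)
n_waves <- tapply(toy$wave, toy$party_id, function(w) length(unique(w)))
both <- toy[toy$party_id %in% names(n_waves)[n_waves == 2], ]

# Sort by party and wave so that diff() returns Wave 2 minus Wave 1
both <- both[order(both$party_id, both$wave), ]
wave_diff <- tapply(both$populism, both$party_id, diff)
wave_diff
```

Sorting before calling `diff()` matters: party B is entered with Wave 2 first, and without the sort its change score would have the wrong sign.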

@fig-party-families-boxplots highlights several patterns. First, while there has been no change in the median level of populism for radical right parties, the box representing the interquartile range for the radical right has shifted towards higher values, indicating that there has been a slight increase in the degree of populism for this party family. There is also considerable variation between radical right parties. Law and Justice (PiS) in Poland and Jobbik in Hungary have become marginally less populist, while the Sweden Democrats (SD) and the Slovak National Party (SNS) in Slovakia have become more populist.

For the remaining party families, the movement of the interquartile range shows that most have become less populist. Of particular note is the degree to which the radical left has become less populist, both in terms of the interquartile range and the median. The Bloc of the Left (BE) and the Unitary Democratic Coalition (CDU) in Portugal have become considerably less populist. It is, moreover, noteworthy that the conservative, green and liberal party families have become less populist in terms of both the interquartile range and the median. In particular, parties such as the Renaissance party (RE / LREM) in France and the New Austria party (NEOS), both from the liberal party family, have become less populist, while the Austrian People's Party (ÖVP), from the Christian democratic party family, also became less populist.

::: {.latex}
\singlespacing
:::

```{r}
#| echo: false
#| fig-width: 6
#| fig-height: 6
#| fig-cap: "Comparing Populism per Waves per Party Family"
#| label: fig-party-families-boxplots


# We make a dataset with parties that appear in both waves.
# We then create populism variable from the CFA from the dataset where parties appear in both waves.

# We filter the parties that are in Wave 1 and Wave 2.

df_parties_both_waves_filter <- df_integrated %>%
  group_by(poppa_id) %>%
  dplyr::summarize(wave_count = n_distinct(wave, na.rm = TRUE)) %>%
  dplyr::filter(wave_count == 2) %>%
  select(poppa_id)

# We create a dataset with only the parties that appear in both waves.

df_parties_both_waves <- semi_join(df_integrated, df_parties_both_waves_filter, by = join_by(poppa_id))


# We remove CFA variables from the original dataset to avoid confusion. 

df_parties_both_waves <- df_parties_both_waves %>%
  select(-c(populism_cfa_cov_3, populism_cfa_resc))

# We run a CFA with three residual covariances on the parties that appear in both waves. 

cfa_cov_3_both <- '

populism_cfa_cov_3_both =~ manichean + generalwill + peoplecentrism + antielitism + indivisible

generalwill ~~ indivisible
peoplecentrism ~~ antielitism
manichean ~~  antielitism

'

# We fit the model to the data.

fit_cov_3_both <- cfa(model = cfa_cov_3_both, data = df_parties_both_waves)


# We use the function from above to attach the factor scores.

df_parties_both_waves <- attach_factor_scores(df_parties_both_waves, fit_cov_3_both)


# We rescale the factor scores.

df_parties_both_waves$populism_cfa_both_resc <- scales::rescale(df_parties_both_waves$populism_cfa_cov_3_both, to = c(0, 10))


# We mean center nativism, authoritarian and lrecon for the interaction effects below. 

df_parties_both_waves <- df_parties_both_waves %>% 
  mutate(nativism_centered = nativism - mean(nativism, na.rm =TRUE),
         authoritarian_centered = authoritarian - mean(authoritarian, na.rm = TRUE),
         lrecon_centered = lrecon - mean(lrecon, na.rm = TRUE) 
  )


# We create the populism difference between the two waves.
# We use the base R diff function to do this. 
# This creates a new variable with the difference between the second and the first wave.
# The diff function is calculated as Wave 2 minus Wave 1.
# A positive value indicates that populism increased in Wave 2 compared to Wave 1.
# The arrange function is important to ensure that poppa_id and wave are sorted correctly. 
# We use the reframe function from dplyr: unlike the summarize function, reframe can return any number of rows per group. 
# We include party names, but only for parties below the 10th or above the 90th percentile. We do this so as not to clutter the plot. 


local({
  set.seed(1234)  # We use a seed so that the label positions are the same every time we render the plot. Note that set.seed() resets the global RNG state, even inside local().
  
  df_parties_both_waves %>%
  dplyr::filter(family_label != "Confessional") %>%
  dplyr::filter(!is.na(populism_cfa_both_resc)) %>% 
  group_by(family_label, poppa_id, party_short, regions) %>%
  arrange(poppa_id, wave) %>%
  reframe(diff_populism = diff(populism_cfa_both_resc)) %>%
  mutate(family_label = forcats::fct_relevel(as.factor(family_label), family_reorder),
         family_label = forcats::fct_rev(family_label)) %>%
  group_by(family_label) %>%
  mutate(p10 = quantile(diff_populism, 0.10, na.rm = TRUE),  # 10th percentile within party family
         p90 = quantile(diff_populism, 0.90, na.rm = TRUE),  # 90th percentile within party family
         is_outside_box = diff_populism < p10 | diff_populism > p90) %>%
  ungroup() %>%
  ggplot(aes(y = family_label, x = diff_populism)) +  
  geom_boxplot() +  
  geom_point(size = 2, alpha = 0.8, aes(group = family_label, colour = regions), stroke = 0.5) +  
  geom_vline(xintercept = 0, linetype = "dashed", color = "red") +  
  geom_text_repel(data = . %>% dplyr::filter(is_outside_box) %>%
                    distinct(family_label, diff_populism, .keep_all = TRUE),
                  aes(label = paste0(party_short)),
                  size = 3, box.padding = 0.6, point.padding = 0.3, max.overlaps = Inf) +  
  theme_bw() +  
  labs(y = "", x = "Positive values indicate higher levels of populism in Wave 2", colour = "Regions") +  
  guides(fill = "none") +
  theme(legend.position = "inside",
        legend.position.inside = c(0.1, 0.3),
        legend.background=element_blank())
})

```

::: {.latex}

\doublespacing

:::

## The Evolution of Populism: Predictors Across Time

The nature of the data also allows us to focus on the predictors of populism. In this section, we once again focus on the contemporaneous and the temporal nature of the data. We begin with an OLS regression analysis on party positions. We are interested in the degree to which ideological dimensions, i.e. attaching ideologies, continue to predict populism. To do so, we run four OLS regression models (see @tbl-pop-char-ols). Given that we are also interested in differences between Wave 1 and Wave 2, we include a wave dummy variable in all models. All models include country fixed effects.
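
The model specification can be sketched with simulated data. Everything in the block below is an assumption made for illustration: the variable names, the number of countries, and the slope of 0.5 and wave shift of -0.4 used to generate the data. None of it reproduces our estimates.

```r
# Illustrative sketch only: simulated data, not the expert survey.
set.seed(42)
n <- 300
sim <- data.frame(
  country  = factor(sample(paste0("C", 1:10), n, replace = TRUE)),
  wave     = factor(sample(c("Wave 1", "Wave 2"), n, replace = TRUE)),
  nativism = runif(n, 0, 10)
)

# Assumed data-generating process: country-specific intercept shifts,
# a nativism slope of 0.5 and a Wave 2 shift of -0.4
country_shift <- rnorm(nlevels(sim$country))
sim$populism <- 2 + 0.5 * sim$nativism - 0.4 * (sim$wave == "Wave 2") +
  country_shift[as.integer(sim$country)] + rnorm(n, sd = 1)

# Entering country as a factor gives one intercept shift per country (fixed effects)
fit <- lm(populism ~ nativism + country + wave, data = sim)
coef(fit)[c("nativism", "waveWave 2")]
```

With country entered as a factor, the nativism and wave coefficients are identified from variation within countries, which is the sense in which we speak of country fixed effects.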

Model 1 regresses populism on nativism and identifies a statistically significant and positive relationship between the two: higher levels of nativism are associated with higher levels of populism. The coefficient for the wave dummy is also statistically significant and negative, implying that Wave 2 is less populist than Wave 1. Model 2 regresses populism on left-right economic positions. The coefficient for economic left-right positions is negative and statistically significant, indicating that parties that hold more left-wing economic positions are more likely to score higher on the populism scale. The coefficient for wave is negative but not statistically significant at the 0.05 level. Model 3 includes a variable for authoritarianism, with higher values indicating that parties are more authoritarian.^[The authoritarian variable combines the lifestyle and law and order items. The item measuring lifestyle is reversed and combined with the law and order item, creating an item that measures authoritarianism (see @tbl-correlations-attaching-ideologies for item correlations).] Model 3 shows a positive and statistically significant effect for authoritarianism: higher levels of authoritarianism are associated with higher levels of populism. The coefficient for wave is negative and statistically significant. 

Model 4 includes nativism and left-right economic positions. Since nativism is included in the model, we do not include authoritarianism, due to multicollinearity between the two. Both nativism and left-right economic positions are statistically significant. Nativism demonstrates a positive effect and the left-right measure of economic positions shows a negative effect, indicating that the less right-wing a party is, the greater the likelihood the party scores higher on the populism scale. Interestingly, the coefficient for the left-right economic variable more than doubles, compared to the value in Model 2. Including nativism in the model isolates the effect of left-right economic positions, emphasizing that more left-wing economic positions are stronger predictors of populism when controlling for nativism. In addition, when nativism or authoritarianism is included in the models, the coefficient for wave is negative and statistically significant.

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false
#| tbl-cap: Populist Party Characteristics
#| tbl-cap-location: top
#| label: tbl-pop-char-ols


# We rename the variables and define which variables to show in the output for the regression table. 

cm_char <- c(
  'nativism' = 'Nativism',
  'authoritarian' = 'Authoritarianism',
  'lrecon' = 'Left-Right Economy',
  'waveWave 2 - 2023' = 'Wave 2',
  '(Intercept)' = '(Intercept)'
)


# We run the OLS models for positions.

models_pop_char <- list(
  "Model 1"  = lm(populism_cfa_resc ~ nativism + country + wave, data = df_integrated),
  "Model 2"  = lm(populism_cfa_resc ~ lrecon + country + wave, data = df_integrated),
  "Model 3"  = lm(populism_cfa_resc ~ authoritarian + country + wave, data = df_integrated),
  "Model 4"  = lm(populism_cfa_resc ~ nativism + lrecon + country + wave, data = df_integrated)

  )



modelsummary::modelsummary(models_pop_char, 
                           estimate = "{estimate}{stars}",                           
                           output = 'kableExtra',
                           coef_map = cm_char,
                           gof_map = c("nobs", "r.squared")) %>%
  kable_styling(font_size = 7) %>% 
  kableExtra::add_footnote(
    label = c("With country fixed effects", 
              "Standard errors are shown in parentheses",
              "+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001"),
    notation = "none"
  )


```

::: {.latex}

\doublespacing

:::

All in all, Models 1-4 illustrate that nativism, authoritarianism, and more left-wing economic positions are strong predictors of populism, as also indicated by the high $R^2$ for both Model 1 and Model 4 [also see @huber2023]. In addition, the models highlight the temporal element of the data. The coefficient for wave is negative and statistically significant in Models 1, 3 and 4, i.e. in the models that include nativism or authoritarianism. In other words, after controlling for nativism or authoritarianism, political parties became less populist in Wave 2, suggesting that parties with lower levels of nativism and authoritarianism have become less populist over the two waves.

## Populism Across Waves: Shifts and Continuities

The results thus suggest that there are differences in parties' degree of populism between the waves. We now examine in more depth how populism has decreased (or increased) between the waves. @tbl-interaction-wave shows seven OLS regression models. The first model regresses populism on the wave dummy. Models 2 and 3 regress populism on nativism and left-right economic positions, adding an interaction between the wave dummy and nativism (Model 2) and between the wave dummy and left-right economic positions (Model 3). In Model 4, we interact the wave dummy with authoritarianism. Models 5, 6 and 7 reproduce Models 1, 2 and 4, but on data that only include parties that appear in both waves. We do so to test the degree to which our results from the first four models are driven by sample composition. It could, for example, be that changes between the waves emanate from the fact that we have different parties in Wave 2 than in Wave 1. Nativism, left-right economy and authoritarianism are mean-centered in all models.
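
The role of mean-centering in an interaction model can be sketched with simulated data. The data-generating values and variable names below are assumptions made purely for illustration.

```r
# Illustrative sketch only: simulated data with an assumed wave-by-nativism interaction.
set.seed(7)
n <- 400
sim <- data.frame(
  wave     = rep(c(0, 1), each = n / 2),  # 0 = Wave 1, 1 = Wave 2
  nativism = runif(n, 0, 10)
)
sim$populism <- 3 + 0.4 * sim$nativism - 1.5 * sim$wave +
  0.2 * sim$wave * sim$nativism + rnorm(n, sd = 1)

# Mean-center the moderator
sim$nativism_centered <- sim$nativism - mean(sim$nativism)

fit_raw      <- lm(populism ~ wave * nativism, data = sim)
fit_centered <- lm(populism ~ wave * nativism_centered, data = sim)

# Centering leaves the interaction coefficient unchanged ...
coef(fit_raw)["wave:nativism"]
coef(fit_centered)["wave:nativism_centered"]

# ... but the wave coefficient now refers to a party with average nativism
wave_at_mean <- coef(fit_raw)["wave"] +
  coef(fit_raw)["wave:nativism"] * mean(sim$nativism)
coef(fit_centered)["wave"]
```

In the uncentered model the wave coefficient describes a party with a nativism score of zero. After centering, it describes a party at the sample mean, which is usually the more meaningful reference point.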

::: {.latex}
\singlespacing
:::

```{r}
#| echo: false
#| tbl-cap: Interaction with Wave
#| label: tbl-interaction-wave

# We now run the regressions for the full dataset and for the dataset with only parties that appear in both waves. 

# We rename the variables and we define which variables to include in the regression table. 

cm_wave_centered <- c(
           'waveWave 2 - 2023' = 'Wave 2',
           'nativism_centered' = 'Nativism (Ctr)',
           'lrecon_centered'  = 'LR Economy (Ctr)',
           'authoritarian_centered' = 'Authoritarianism (Ctr)',
           'waveWave 2 - 2023:nativism_centered' = 'Wave 2 x Nativism (Ctr)',
           'waveWave 2 - 2023:lrecon_centered' =  'Wave 2 x LR Economy (Ctr)',
           'waveWave 2 - 2023:authoritarian_centered' = 'Wave 2 x Authoritarianism (Ctr)',
           '(Intercept)' = '(Intercept)'
         )
         

#  We run the regression models for the full dataset and for the parties that appear in both waves.

models_interaction_wave <- list(
           "Model 1 (FD)" = lm(populism_cfa_resc ~ wave + country, data = df_integrated),
           "Model 2 (FD)" = lm(populism_cfa_resc ~ lrecon_centered + wave*nativism_centered + country, data = df_integrated),
           "Model 3 (FD)" = lm(populism_cfa_resc ~ nativism_centered + wave*lrecon_centered + country, data = df_integrated),
           "Model 4 (FD)" = lm(populism_cfa_resc ~ lrecon_centered + wave*authoritarian_centered + country, data = df_integrated),
           "Model 5 (DSP)" = lm(populism_cfa_both_resc ~ wave + country, data = df_parties_both_waves),
           "Model 6 (DSP)" = lm(populism_cfa_both_resc ~ lrecon_centered + wave*nativism_centered + country, data = df_parties_both_waves),
           "Model 7 (DSP)" = lm(populism_cfa_both_resc ~ lrecon_centered + wave*authoritarian_centered + country, data = df_parties_both_waves)
         )

modelsummary::modelsummary(models_interaction_wave, 
                           estimate = "{estimate}{stars}",
                           output = 'kableExtra',
                           coef_map = cm_wave_centered,
                           gof_map = c("nobs", "r.squared")) %>%
  kable_styling(font_size = 8) %>%
  column_spec(2:8, width = "5em") %>%  # Columns 2 to 8 hold the seven models.
  kableExtra::add_footnote(
    label = c("With country fixed effects; FD: Full dataset; DSP: Dataset with shared parties", 
              "Standard errors are shown in parentheses",
              "+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001"),
    notation = "none"
  )

```

::: {.latex}

\doublespacing

:::

The coefficient for wave in Model 1 is negative, implying that Wave 2 is less populist than Wave 1. The coefficient is, however, only statistically significant at the 0.10 level. The interaction term for wave and nativism in Model 2 is statistically significant with a p-value of less than 0.05: nativism is a stronger predictor of populism in Wave 2. The interaction term for wave and left-right economic positions in Model 3 is not statistically significant. Model 4 also demonstrates a positive interaction effect for wave and authoritarianism, but only at the 0.10 level. Models 5 through 7 include only parties that are present in both waves. The coefficient for wave in Model 5 is negative but not statistically significant. In Model 6, the interaction term for wave and nativism is positive, with a p-value of 0.051, i.e. just above the common 0.05 threshold. For Model 7, the interaction term for wave and authoritarianism is not statistically significant.

::: {.latex}

\singlespacing

:::

\vspace{10pt}

```{r}
#| echo: false
#| fig-width: 6
#| fig-height: 10
#| fig-cap: "Marginal Effects for Interaction Effects"
#| label: fig-marginal-effects


# These are the marginal effects plots showing the effect of wave on populism, conditioned by (i.e., moderated by) nativism (centered) and authoritarianism (centered).

interaction_wave_nativism_me <- plot_slopes(models_interaction_wave[["Model 2 (FD)"]], variables = "wave", condition = "nativism_centered") +
labs(title = "Marginal Effects of Wave on Populism, Conditioned by Nativism (Full Dataset)",
     x = "Nativism (Centered)") + 
  geom_hline(yintercept = 0, linetype = "dashed", color = "red") 


interaction_wave_authoritarian_me <- plot_slopes(models_interaction_wave[["Model 4 (FD)"]], variables = "wave", condition = "authoritarian_centered") +
  labs(title = "Marginal Effects of Wave on Populism, Conditioned by Authoritarianism (Full Dataset)",
       x = "Authoritarianism (Centered)") + 
   geom_hline(yintercept = 0, linetype = "dashed", color = "red") 


interaction_wave_nativism_both_me <- plot_slopes(models_interaction_wave[["Model 6 (DSP)"]], variables = "wave", condition = "nativism_centered") +
  labs(title = "Marginal Effects of Wave on Populism, Conditioned by Nativism (Parties in Both Waves)",
       x = "Nativism (Centered)") + 
   geom_hline(yintercept = 0, linetype = "dashed", color = "red") 
 


(
  interaction_wave_nativism_me + 
  interaction_wave_authoritarian_me +
  interaction_wave_nativism_both_me +
  plot_layout(nrow = 3, guides = "collect") +
  plot_annotation(tag_levels = '1') 
  ) &  
  theme_bw() &
  theme(plot.title = element_text(hjust = 0.5, size = 9),
        axis.title = element_text(size = 9),
        axis.text.y = element_text(size = 9),
        panel.grid = element_blank()) 


```

\vspace{10pt}

::: {.latex}

\doublespacing

:::

To better visualize the interaction effects, we plot the marginal effects and the predictions (@fig-marginal-effects; @fig-predictions). We visualize the interaction effects for both nativism and authoritarianism.^[Although the interaction effect for wave and authoritarianism is only statistically significant at the 0.10 level and the p-value for the interaction term for wave and nativism in Model 6 is just above 0.05, following @brambor2006 we further investigate whether there is a statistically significant effect for a particular range of the variables.] The marginal effects plots show how the effect of Wave 2 (in comparison with Wave 1) on populism varies with nativism and authoritarianism. At higher values of nativism and authoritarianism, the effect is not statistically significant, as the confidence intervals include zero. At lower levels of nativism and authoritarianism, however, the effect is statistically significant. In other words, the negative effect of Wave 2 (in comparison with Wave 1) is stronger for political parties that are less nativist and less authoritarian. The prediction plots visualize the same results in a different manner. These plots also highlight that parties with lower levels of nativism and authoritarianism are less populist, and they show that this becomes even more pronounced in Wave 2. In other words, political parties that are less nativist and less authoritarian are less populist in Wave 2 than in Wave 1.
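
The quantity shown in a marginal effects plot can be computed by hand. The sketch below uses simulated data and a hypothetical helper, `conditional_effect`, to obtain the effect of wave at chosen values of a centered moderator, together with a 95% confidence interval based on the variance of a linear combination of coefficients. All names and data-generating values are assumptions for illustration.

```r
# Illustrative sketch only: conditional effect of wave across a centered moderator.
set.seed(11)
n <- 400
sim <- data.frame(
  wave = rep(c(0, 1), each = n / 2),  # 0 = Wave 1, 1 = Wave 2
  mod  = runif(n, -5, 5)              # hypothetical centered moderator
)
sim$y <- 5 + 0.4 * sim$mod - 0.3 * sim$wave + 0.15 * sim$wave * sim$mod +
  rnorm(n, sd = 1)
fit <- lm(y ~ wave * mod, data = sim)

conditional_effect <- function(fit, m, x = "wave", xz = "wave:mod") {
  b <- coef(fit)
  v <- vcov(fit)
  # Effect of x at moderator value m: b_x + b_xz * m
  est <- unname(b[x] + b[xz] * m)
  # Var(b_x + m * b_xz) = Var(b_x) + m^2 Var(b_xz) + 2 m Cov(b_x, b_xz)
  se <- sqrt(v[x, x] + m^2 * v[xz, xz] + 2 * m * v[x, xz])
  crit <- qt(0.975, df.residual(fit))
  data.frame(moderator = m, estimate = est,
             lower = est - crit * se, upper = est + crit * se)
}

conditional_effect(fit, m = c(-4, 0, 4))
```

An interaction term that falls short of conventional significance can still coincide with a conditional effect whose interval excludes zero over part of the moderator's range, which is why we inspect the effect across that range rather than relying on the interaction coefficient alone.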

::: {.latex}

\singlespacing

:::

```{r}
#| echo: false
#| fig-width: 6
#| fig-height: 10
#| fig-cap: "Predictions for Interaction Effects"
#| label: fig-predictions


# These are the prediction plots showing the effect of wave on populism across levels of nativism (centered) and authoritarianism (centered).

interaction_wave_nativism_pred <- plot_predictions(models_interaction_wave$`Model 2`,
                 condition = list("wave", nativism_centered = "threenum")) +
  labs(
    title = "Predictions of Wave on Populism, by Nativism (Full Dataset)",
    x = "",
    y = "Populism",
    colour = "Moderator (Ctr)") 


interaction_wave_authoritarian_pred <- plot_predictions(models_interaction_wave$`Model 4`,
                 condition = list("wave", authoritarian_centered = "threenum")) +
  labs(
    title = "Predictions of Wave on Populism, by Authoritarianism (Full Dataset)",
    x = "",
    y = "Populism",
    colour = "Moderator (Ctr)") 

interaction_wave_nativism_both_pred <- plot_predictions(models_interaction_wave$`Model 6`,
                 condition = list("wave", nativism_centered = "threenum")) +
  labs(
    title = "Predictions of Wave on Populism, by Nativism (Parties in Both Waves)",
    x = "",
    y = "Populism",
    colour = "Moderator (Ctr)")


(
 interaction_wave_nativism_pred + 
 interaction_wave_authoritarian_pred + 
 interaction_wave_nativism_both_pred +
  plot_layout(nrow = 3, guides = "collect") +
  plot_annotation(tag_levels = '1') 
) &
  theme_bw() &
  theme(
    legend.position = "bottom",
    plot.title = element_text(hjust = .5, size = 10),
    axis.title = element_text(size = 10),
    axis.text.x = element_text(size = 10),
    legend.text = element_text(size = 10) 
  ) 


```

::: {.latex}

\doublespacing

:::

We draw three conclusions. First, the increased predictive power of nativism (and, to a lesser degree, of authoritarianism) for populism in Wave 2 compared with Wave 1 stems in large part from the difference between parties that score higher on nativism and authoritarianism and parties that score lower on them. Second, the level of populism among parties that are less nativist and less authoritarian, such as liberal, Green and radical left parties, has decreased in Wave 2. These results confirm the findings from @tbl-pop-char-ols, which show that political parties were less populist in Wave 2 than in Wave 1 when controlling for nativism and authoritarianism. Third, these findings hold true for nativism when we include only parties that are in both waves, which avoids sample composition effects. In sum, we conclude that parties that are less nativist (and, to a lesser degree, less authoritarian) have become less populist in Wave 2 in comparison with Wave 1.

# Conclusion 

This study developed and validated a robust, multidimensional measure of populism in European political parties over time, addressing the need for reliable longitudinal data on party-based populism. By introducing the second wave of the Populism and Political Parties Expert Survey (POPPA), we offer a comprehensive measure that captures populism in 312 parties across 31 European countries for 2023.

We have argued that populism is a complex latent phenomenon that spans multiple dimensions, requiring a multidimensional approach with multiple indicators to capture it. Each indicator must capture essential elements of populism to ensure the quality of the latent construct. This approach allows us to account for populism’s attachment to diverse ideological leanings and dimensions. Thus, we posit that the multidimensional, continuous, and comprehensive approach is best suited for measuring populism and assessing its empirical implications.

The POPPA 2023 populism measure operationalizes the core elements of the ideational approach. Our analysis demonstrates that the five sub-dimensions in POPPA 2023 exhibit consistent loading patterns, ensure broad coverage, have strong discriminatory power, and display strict measurement invariance across survey waves, supporting the validity of a multidimensional approach.

Our analysis confirms that radical right and radical left party families are more populist than other groups, while specific party families—such as Christian democratic, green, and liberal parties—score lower on the populism scale. Additionally, regional differences emerge: some radical right parties in Central and Eastern Europe are generally less populist than those elsewhere, while some radical left parties in Southern Europe exhibit higher levels of populism. Differences in populism among Christian democratic, social democratic, and liberal parties in Central and Eastern Europe suggest that regional factors influence how populism attaches to these parties. While our regional analysis is descriptive, we encourage further research into the role of POPPA 2023 in uncovering regional variations in populism.

We also use POPPA 2023 to explore the ideological predictors of populism. Nativism, authoritarianism, and more left-wing economic positions remain strong predictors of populism. The analysis of changes between the two waves shows an overall decline in the level of populism. The descriptive analysis shows that conservative, social democratic, Green and liberal parties have become less populist, while the OLS regression analysis demonstrates that political parties that are less nativist and less authoritarian are less populist in Wave 2. Nativism and authoritarianism are stronger predictors of populism in Wave 2. However, this effect is largely driven by the decline in levels of populism among parties that are less nativist and less authoritarian.

In sum, this paper underscores the strengths of POPPA 2023’s multidimensional, continuous, and comprehensive approach. This approach facilitates in-depth comparisons across party families and regions and tracks changes in populism’s predictors between waves. We encourage researchers to leverage this multidimensional framework to investigate pressing questions on populism and political parties within what researchers refer to as the “populist Fourth Wave” [@mudde2017].

  
::: {.latex}

\singlespacing

:::

\theendnotes

# References

::: {#refs}
:::

::: {.latex}

\newpage

:::

# Online Appendix

\vspace{30pt}

::: {.latex}

\appendix
\renewcommand{\thefigure}{A\arabic{figure}}
\renewcommand{\thetable}{A\arabic{table}}
\setcounter{figure}{0}
\setcounter{table}{0}

:::

## Descriptive Statistics Full Dataset

```{r}
#| echo: false
#| warning: false
#| tbl-cap: Summary Statistics (Full Dataset)
#| label: tbl-summary-statistics-full-appendix


# Descriptive table for the full dataset, including the CFA populism variable. 
# we use the vtable package.

# We create a new dataset for the descriptive table. 
# We do this in case we need to do some extra data manipulations for the table. 

df_integrated_table_appendix <- df_integrated


# In the published paper we did not include the party families in the descriptive table. We include them here. 
# We do not include the Confessional parties since there were too few. There were only six. As such we have 555 parties in the party families. 


df_integrated_table_appendix <- df_integrated_table_appendix 

# We order the party families for better presentation 

df_integrated_table_appendix <- df_integrated_table_appendix %>%
  mutate(family_label = forcats::fct_relevel(family_label, !!!family_reorder))


# We round the following variables for better presentation in the descriptive table. 
# In the paper, for aesthetic reasons, we round to fewer decimal places than in the replication file.
# In the paper we round to 0 decimal places for these variables; here we round to 2.

df_integrated_table_appendix$populism_cfa_cov_3 <- round(df_integrated_table_appendix$populism_cfa_cov_3, digits = 2)  
df_integrated_table_appendix$lrecon_centered <- round(df_integrated_table_appendix$lrecon_centered, digits = 2) 
df_integrated_table_appendix$nativism_centered <- round(df_integrated_table_appendix$nativism_centered, digits = 2) 
df_integrated_table_appendix$authoritarian_centered <- round(df_integrated_table_appendix$authoritarian_centered, digits = 2) 

# We rename the variables. 

labs_appendix <- c(populism_mean = 'Populism (Mean)',
                   populism_cfa_resc = 'Populism (CFA: Rescaled)',
                   populism_cfa_cov_3 = 'Populism (CFA: Standardized)',
                   antielitism = 'Anti-Elitism',
                   generalwill = 'General Will',
                   manichean   = 'Manichean',
                   indivisible = 'Indivisible',
                   peoplecentrism = 'People Centrism',
                   lroverall = 'Left-Right Overall',
                   lrecon = 'Left-Right Economy',
                   immigration = 'Immigration',
                   nativism = 'Nativism',
                   authoritarian = 'Authoritarian',
                   laworder = 'Law and Order',
                   lifestyle = 'Lifestyle',
                   eu = 'European Integration',
                   saliencecult = 'Salience Culture',
                   salienceecon = 'Salience Economy',
                   lrecon_centered = 'Left-Right Economy (Ctr)',
                   nativism_centered = 'Nativism (Ctr)',
                   authoritarian_centered = 'Authoritarian (Ctr)',
                   family_label = 'Party Families',               # This was not included in the published paper. 
                   wave = 'Wave')


# We create the descriptive table with vtable and kableExtra.

table_appendix <- df_integrated_table_appendix %>%
  select(populism_mean, 
         populism_cfa_resc,
         populism_cfa_cov_3,
         antielitism,
         generalwill, 
         manichean,  
         indivisible, 
         peoplecentrism, 
         lroverall, 
         lrecon,
         immigration,
         nativism, 
         authoritarian,
         laworder,
         lifestyle,
         eu,
         saliencecult,
         salienceecon,
         lrecon_centered,
         nativism_centered,
         authoritarian_centered,
         family_label,          # This was not included in the published paper.
         wave) %>%
  mutate(wave = forcats::fct_rev(wave) ) %>%
  vtable::sumtable(
    labels = labs_appendix,
    summ = c('notNA(x)',  
             'mean(x)',   
             'sd(x)',     
             'min(x)',    
             'max(x)'),   
    summ.names = c('N', 'Mean/ Percent', 'Std. Dev.', 'Min', 'Max'), 
    out = 'return'       
  )

table_appendix %>%
  kable(
    format = "latex",  
    booktabs = TRUE,
    linesep = "",
    row.names = FALSE
  ) %>%
  kable_classic() %>% 
  kable_styling(
    latex_options = "scale_down",   
    full_width = TRUE,             
    position = "center",           
    font_size = 8
  ) %>%
  column_spec(1, width = "16em") %>%   
  column_spec(2:5, width = "5em") %>% 
  footnote(general = "We do not include the Confessional parties in the analyses in the paper due to the low number of parties.",
         general_title = "",
         footnote_as_chunk = TRUE)


```

\newpage

## Additional CFA Analysis 

::: {.latex}

\doublespacing

:::

### Confirmatory Factor Analysis Models

Performing a Confirmatory Factor Analysis (CFA) is an iterative process that combines assessing model fit, inspecting the modification indices, and theoretical justification. @tbl-cfa-full_unstd, @tbl-cfa-cov-1_unstd, @tbl-cfa-cov-2_unstd and @tbl-cfa-cov-3_unstd display the output for the CFA models. To assess the model fit we apply commonly accepted standards (CFI $\geq$ 0.95; RMSEA $\leq$ 0.05; RMSEA p-value $\geq$ 0.05; SRMR $\leq$ 0.05) [see @hu1999].

The baseline model (see @tbl-cfa-full_unstd) shows a poor fit: the CFI is too low, the RMSEA is too high, and the p-value for the RMSEA is statistically significant. The only acceptable value is the SRMR, which is 0.048. The first set of modification indices (see @tbl-modification-indices-basic) indicates that the residual covariance *generalwill ~~ indivisible* has a high modification index, as does *peoplecentrism ~~ antielitism*. Both residual covariances can be theoretically justified. We choose *generalwill ~~ indivisible* for the first iteration. Both of these items tap into the idea of the homogeneous people. As such, we can expect shared variance between these two items.

@tbl-cfa-cov-1_unstd shows the output for the model with one residual covariance. This model produces a better, but still inadequate, fit. Although the CFI is higher and the value of the SRMR is acceptable, the other values are not within the accepted range. The modification indices for the model with a single residual covariance (see @tbl-modification-indices-cov-1) show a high modification index for *manichean ~~ peoplecentrism*. However, adding this residual covariance results in a negative correlation between these two items. Further inspection of the modification indices shows multicollinearity issues between Manichean and people centrism, with correlations exceeding 1.

As such, we re-run the model and include the residual covariance for *peoplecentrism ~~ antielitism*. The modification indices show that this pair of items also scores high (see @tbl-modification-indices-cov-1). There is also good theoretical reason to include a residual covariance between these items: people-centrism and anti-elitism tap into the two poles of populism, pitting the people against the elite. The model with two residual covariances (see @tbl-cfa-cov-2_unstd) shows a good fit on most indices. However, the RMSEA remains problematic, as in the previous models.

To improve the model fit, we include an additional residual covariance. The modification indices (@tbl-modification-indices-cov-2) show that two pairs of items have high scores: *manichean ~~ antielitism* and *manichean ~~ peoplecentrism*. We include a residual covariance for *manichean ~~ antielitism* because its modification index is slightly higher, and because of the earlier issues with *manichean ~~ peoplecentrism* and the negative correlation between these two items. Again, there is good reason to believe that Manicheanism and anti-elitism have shared variance. Both items tap into the antagonistic nature of populism, i.e. the idea of the good people versus the corrupt elite. This model with three residual covariances (see @tbl-cfa-cov-3_unstd) produces an excellent model fit. The model produces a high CFI, while the RMSEA and the SRMR are well within the accepted range. The Chi-squared is also not statistically significant, implying that the difference between the predicted and the observed covariance structures is very small. The modification indices (see @tbl-modification-indices-cov-3) also show no remaining high modification scores, indicating that the model is well specified with three residual covariances.

The model with three residual covariances has only 2 degrees of freedom, which is close to the limit of what is generally considered acceptable model complexity. However, the model remains theoretically justified and provides a significantly improved fit compared to the simpler models. The fit indices indicate that this model captures the underlying structure of the data well. An examination of the residuals also shows that the model has a good fit. @tbl-residuals-cov-3-cov shows the covariances of residuals. Most of the values are near zero, suggesting a good fit. @tbl-residuals-cov-3-cov-z shows the standardized residual covariances. All of the values are less than 2, which is considered acceptable.
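
The degrees of freedom of these models follow from a simple count. As a sketch, assume a single-factor model with $p = 5$ indicators, the usual identification constraint (one loading fixed to one), and no mean structure. The symbols $p$, $q_{\text{baseline}}$ and $k$ are notation introduced here, with $k$ denoting the number of residual covariances.

$$
\begin{aligned}
\text{unique variances and covariances} &= \frac{p(p+1)}{2} = \frac{5 \times 6}{2} = 15 \\
q_{\text{baseline}} &= 4 \text{ loadings} + 5 \text{ residual variances} + 1 \text{ factor variance} = 10 \\
df &= 15 - (q_{\text{baseline}} + k) = 5 - k
\end{aligned}
$$

With $k = 3$ residual covariances this gives $df = 2$, so only two further parameters could be freed before the model becomes just-identified.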

\newpage

::: {.latex}

\singlespacing

:::

## CFA Models 

\vspace{10pt}

```{r}
#| echo: false
#| output: asis
#| tbl-cap: CFA Baseline Model (Unstandardized Estimates)
#| label: tbl-cfa-full_unstd

# CFA table for the baseline model
         
cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_basic_full.tex}
\\hspace*{2cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')

```

\vspace{10pt}

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Standardized Factor Loadings for Populism CFA (Baseline Model)
#| label: tbl-cfa-full_baseline_std

# CFA table for the baseline model 
         
standardizedSolution(fit_basic_full) %>%
  dplyr::filter(op == "=~") %>%
  select(rhs, est.std, se, pvalue) %>%
  rename(
    Item = rhs,
    `Standardized Estimate` = est.std,
    `Std. Error` = se,
    `p-value` = pvalue
  ) %>%
  mutate(Item = forcats::fct_recode(factor(Item),
                               Manichean = "manichean",
                               "General Will" = "generalwill",
                               "People Centrism" = "peoplecentrism",
                               "Anti-Elitism" = "antielitism",
                               Indivisible    = "indivisible"
  ) ) %>%
  kable(digits = 3,
        align = "lrrr",
        booktabs = TRUE,
        format = "latex") %>%
  kable_styling(font_size = 8) 

```

\newpage

```{r}
#| echo: false
#| output: asis
#| tbl-cap: CFA with One Residual Covariance (Unstandardized Estimates)
#| label: tbl-cfa-cov-1_unstd

# CFA table for 1 covariance
         
cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_cov_1_full.tex}
\\hspace*{1cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')

```

\vspace{10pt}

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Standardized Factor Loadings for Populism CFA (One Residual Covariance)
#| label: tbl-cfa-full_cov_1_std

# CFA table for the model with 1 residual covariance
         
standardizedSolution(fit_cov_1_full) %>%
  dplyr::filter(op == "=~") %>%
  select(rhs, est.std, se, pvalue) %>%
  rename(
    Item = rhs,
    `Standardized Estimate` = est.std,
    `Std. Error` = se,
    `p-value` = pvalue
  ) %>%
  mutate(Item = forcats::fct_recode(factor(Item),
                               Manichean = "manichean",
                               "General Will" = "generalwill",
                               "People Centrism" = "peoplecentrism",
                               "Anti-Elitism" = "antielitism",
                               Indivisible    = "indivisible"
  ) ) %>%
  kable(digits = 3,
        align = "lrrr",
        booktabs = TRUE,
        format = "latex") %>%
  kable_styling(font_size = 8)

```

\newpage

```{r}
#| echo: false
#| output: asis
#| tbl-cap: CFA with Two Residual Covariances (Unstandardized Estimates)
#| label: tbl-cfa-cov-2_unstd

# CFA table for two residual covariances
         
cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_cov_2_full.tex}
\\hspace*{0.25cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')

```

\vspace{10pt}

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Standardized Factor Loadings for Populism CFA (Two Residual Covariances)
#| label: tbl-cfa-full_cov_2_std

# CFA table for the model with 2 residual covariances 
         
standardizedSolution(fit_cov_2_full) %>%
  dplyr::filter(op == "=~") %>%
  select(rhs, est.std, se, pvalue) %>%
  rename(
    Item = rhs,
    `Standardized Estimate` = est.std,
    `Std. Error` = se,
    `p-value` = pvalue
  ) %>%
  mutate(Item = forcats::fct_recode(factor(Item),
                               Manichean = "manichean",
                               "General Will" = "generalwill",
                               "People Centrism" = "peoplecentrism",
                               "Anti-Elitism" = "antielitism",
                               Indivisible    = "indivisible"
  ) ) %>%
  kable(digits = 3,
        align = "lrrr",
        booktabs = TRUE,
        format = "latex") %>%
  kable_styling(font_size = 8)

```

\newpage

```{r}
#| echo: false
#| output: asis
#| tbl-cap: CFA with Three Residual Covariances (Unstandardized Estimates)
#| label: tbl-cfa-cov-3_unstd


# CFA table for three residual covariances
         
cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_cov_3_full_appendix.tex}
\\hspace*{0.25cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')

```

\vspace{10pt}

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Standardized Factor Loadings for Populism CFA (Three Residual Covariances)
#| label: tbl-cfa-full_cov_3_std

# CFA table for the model with 3 residual covariances 
         
standardizedSolution(fit_cov_3_full) %>%
  dplyr::filter(op == "=~") %>%
  select(rhs, est.std, se, pvalue) %>%
  rename(
    Item = rhs,
    `Standardized Estimate` = est.std,
    `Std. Error` = se,
    `p-value` = pvalue
  ) %>%
  mutate(Item = forcats::fct_recode(factor(Item),
                               Manichean = "manichean",
                               "General Will" = "generalwill",
                               "People Centrism" = "peoplecentrism",
                               "Anti-Elitism" = "antielitism",
                               Indivisible    = "indivisible"
  ) ) %>%
  kable(digits = 3,
        align = "lrrr",
        booktabs = TRUE,
        format = "latex") %>%
  kable_styling(font_size = 8)

```

\newpage

### Modification Indices Tables for CFA Models

\vspace{20pt}


```{r}
#| echo: false
#| tbl-cap: Modification Indices (Baseline Model)
#| label: tbl-modification-indices-basic

# We create tables for the modification indices. 
# We use the models from above. 

# For the baseline model


modificationindices(fit_basic_full, sort = TRUE) %>%
  select(lhs, op, rhs, mi, epc) %>%
  kable(row.names = FALSE) %>%
  kable_classic() 

```

\vspace{40pt}

```{r}
#| echo: false
#| tbl-cap: Modification Indices (One Residual Covariance)
#| label: tbl-modification-indices-cov-1


# We create tables for the modification indices. 
# We use the models from above. 

# For the model with one residual covariance


modificationindices(fit_cov_1_full, sort = TRUE) %>%
  select(lhs, op, rhs, mi, epc) %>%
  kable(row.names = FALSE) %>%
  kable_classic() 

```


\vspace{40pt}

\newpage

```{r}
#| echo: false
#| tbl-cap: Modification Indices with Manichean and People Centrism
#| label: tbl-modification-indices-cov-2-manichean


# We create tables for the modification indices. 
# We use the models from above. 

# For the model with two residual covariances with manichean and people centrism

modificationindices(fit_cov_2_full_manichean, sort = TRUE) %>%
  select(lhs, op, rhs, mi, epc) %>%
  kable(row.names = FALSE) %>%
  kable_classic()

```


\vspace{30pt}

```{r}
#| echo: false
#| tbl-cap: Modification Indices (Two Residual Covariances)
#| label: tbl-modification-indices-cov-2

# We create tables for the modification indices. 
# We use the models from above. 

# For the model with two residual covariances (second version)


modificationindices(fit_cov_2_full, sort = TRUE) %>%
  select(lhs, op, rhs, mi, epc) %>%
  kable(row.names = FALSE) %>%
  kable_classic() 

```

\vspace{20pt}

```{r}
#| echo: false
#| tbl-cap: Modification Indices (Three Residual Covariances)
#| label: tbl-modification-indices-cov-3

# We create tables for the modification indices. 
# We use the models from above. 

# For the model with three residual covariances

modificationindices(fit_cov_3_full, sort = TRUE) %>%
  select(lhs, op, rhs, mi, epc) %>%
  kable(row.names = FALSE) %>%
  kable_classic()

```

\newpage

### Covariances of Residuals

\vspace{20pt}

```{r}
#| echo: false
#| tbl-cap: Covariances of Residuals (Three Residual Covariances)
#| label: tbl-residuals-cov-3-cov


residuals_cov_3 <- lavResiduals(fit_cov_3_full, type = "cor")

residuals_cov_3[["cov"]] %>%
  as.data.frame() %>%
  `rownames<-`(c("Manichean", 
                 "General Will", 
                 "People Centrism", 
                 "Anti-Elitism", 
                 "Indivisible")) %>%  # Directly replace the old row names with new names
  rename(Manichean = manichean,
         `General Will` = generalwill,
         `People Centrism` = peoplecentrism,
         `Anti-Elitism` = antielitism,
         Indivisible = indivisible) %>%
  mutate(across(everything(), ~ round(.x, 3))) %>%
  {
    mat <- as.matrix(.)
    mat[upper.tri(mat)] <- ""  # Replace upper triangle with blanks
    as.data.frame(mat)
  } %>%
  rownames_to_column(var = "Variable") %>%
  kable()  %>%
  kable_classic()


```

\vspace{20pt}

```{r}
#| echo: false
#| tbl-cap: Standardized Covariances of Residuals (Three Residual Covariances)
#| label: tbl-residuals-cov-3-cov-z


residuals_cov_3[["cov.z"]] %>%
  as.data.frame() %>%
  `rownames<-`(c("Manichean", 
                 "General Will", 
                 "People Centrism", 
                 "Anti-Elitism", 
                 "Indivisible")) %>%  
  rename(Manichean = manichean,
         `General Will` = generalwill,
         `People Centrism` = peoplecentrism,
         `Anti-Elitism` = antielitism,
         Indivisible = indivisible) %>%
  mutate(across(everything(), ~ round(.x, 3))) %>%
  {
    mat <- as.matrix(.)
    mat[upper.tri(mat)] <- ""  # Replace upper triangle with blanks
    as.data.frame(mat)
  } %>%
  rownames_to_column(var = "Variable") %>%
  kable()  %>%
  kable_classic()


```

\newpage

### Multigroup Confirmatory Factor Analysis Models

\vspace{20pt}

```{r}
#| echo: false
#| results: asis
#| tbl-cap: Multigroup CFA Baseline Model (Unstandardized Estimates)
#| label: tbl-cfa-multi-group-basic


# CFA multigroup baseline model

cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_mgm_basic.tex}
\\hspace*{.25cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')


```

\newpage

```{r}
#| echo: false
#| results: asis
#| tbl-cap: Multigroup CFA with One Residual Covariance (Unstandardized Estimates)
#| label: tbl-cfa-multi-group-cov-1


# CFA multigroup model with one residual covariance

cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_mgm_cov_1.tex}
\\hspace*{-0.8cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')


```

\newpage

```{r}
#| echo: false
#| results: asis
#| tbl-cap: Multigroup CFA with Two Residual Covariances (Unstandardized Estimates)
#| label: tbl-cfa-multi-group-cov-2


# CFA multigroup model with two residual covariances

cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_mgm_cov_2.tex}
\\hspace*{-1.5cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')

```

\newpage

```{r}
#| echo: false
#| results: asis
#| tbl-cap: Multigroup CFA with Three Residual Covariances (Unstandardized Estimates)
#| label: tbl-cfa-multi-group-cov-3


# CFA multigroup model with three residual covariances

cat('
\\begin{table}
\\centering
\\footnotesize  
\\input{cfa_tables/fit_mgm_cov_3_appendix.tex}
\\hspace*{-1.5cm}{\\parbox{0.6\\linewidth}{\\textit{Note}: All estimates are unstandardized.}}
\\end{table}
')


```

\newpage

## Invariance Test for All Models

\vspace{20pt}

```{r}
#| echo: false
#| output: asis
#| label: tbl-invariance-all-models
#| tbl-cap: Invariance Test (All Models)

# We create a table for the invariance models. 
# We ran the models above so as to avoid issues of colliding variables. 

# We convert the data to a LaTeX-friendly table.

combined_lavTestLRT_results_xtable <- xtable(combined_lavTestLRT_results)

# We print the LaTeX table.

print(
  combined_lavTestLRT_results_xtable, 
  type = "latex", 
  booktabs = TRUE, 
  include.rownames = FALSE,
  comment = FALSE,
  floating = FALSE
)


```

\newpage

## Item Response Theory for Wave 1 and Wave 2

\vspace{40pt}

```{r}
#| include: false

# We run the IRT for Wave 1.

df_irt_five_wave_1 <- df_integrated %>%
  dplyr::filter(wave == "Wave 1 - 2018") %>%
  select(manichean,                
         generalwill,       
         peoplecentrism,    
         antielitism,
         indivisible)


df_irt_five_wave_1 <- na.omit(df_irt_five_wave_1)

# Since our variable is not an integer we round the values off.

df_irt_five_wave_1 <- df_irt_five_wave_1 %>%
  mutate(across(everything(), round))  


# Fit the model using the GRM for polytomous (ordered) items

fit_5_wave_1 <- mirt(data = df_irt_five_wave_1, model = model_graded, itemtype = "graded")

```
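
The item information and item probability plots in this section rest on a graded response model (GRM). As an illustration of what such a model implies, the following base R sketch computes category probabilities for a single item under the standard Samejima parameterization. The function `grm_category_probs()` and the parameter values `a = 1.5` and `b = c(-1, 0, 1)` are hypothetical and chosen only for illustration. They are not estimates from the POPPA data, and the models reported here are estimated with `mirt`, not with this function.

```r
# Category probabilities for one item under a graded response model.
# `a` is the item discrimination and `b` the ordered category thresholds.
# All parameter values used below are hypothetical, not POPPA estimates.
grm_category_probs <- function(theta, a, b) {
  stopifnot(length(a) == 1, a > 0, !is.unsorted(b))
  # P(X >= k | theta) for each threshold: one row per theta, one column per threshold.
  z <- outer(theta, b, function(t, bk) a * (t - bk))
  p_at_or_above <- 1 / (1 + exp(-z))
  # Pad with P(X >= lowest category) = 1 and P(X > highest category) = 0,
  # then difference adjacent columns to obtain P(X = k | theta).
  cumulative <- cbind(1, p_at_or_above, 0)
  probs <- cumulative[, -ncol(cumulative), drop = FALSE] -
    cumulative[, -1, drop = FALSE]
  colnames(probs) <- paste0("cat_", seq_len(ncol(probs)) - 1)
  probs
}

theta_grid <- seq(-3, 3, by = 0.1)  # same theta grid as used for the plots
example_probs <- grm_category_probs(theta_grid, a = 1.5, b = c(-1, 0, 1))
```

Each row of `example_probs` holds the probabilities of the response categories at one value of $\theta$, so the rows sum to one. A larger `a` makes the category curves steeper, which is what the discriminatory power of an item refers to.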

```{r}
#| echo: false
#| fig-width: 14
#| fig-height: 8
#| fig-cap: "Item Information Plots Wave 1"
#| label: item-information-wave-1

# We apply the function from above for the Item Information for Wave 1

theta_values_wave_1 <- seq(-3, 3, by = 0.1)
item_names_wave_1 <- colnames(extract.mirt(fit_5_wave_1, 'data'))
info_results_wave_1 <- extract_info_data(fit_5_wave_1, theta_values_wave_1, item_names_wave_1)
info_results_wave_1$Item <- factor(info_results_wave_1$Item, levels = item_names_wave_1)

# We plot information data.

info_results_wave_1 %>%
  mutate(Item = forcats::fct_recode(Item, !!!populist_items),  
         Item = forcats::fct_relevel(Item, !!!populism_items_order)) %>%
  ggplot(aes(x = Theta, y = Information, color = Item)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ Item) +
  theme_bw() +
  labs(x = expression(theta),
       y = "Information") +
  theme(plot.title = element_text(hjust = 0.5), 
        legend.position = "bottom")

```

\newpage

```{r}
#| echo: false
#| eval: true
#| include: false

# We run the IRT for Wave 2.

df_irt_five_wave_2 <- df_integrated %>%
  dplyr::filter(wave == "Wave 2 - 2023") %>%
  select(manichean,                
         generalwill,       
         peoplecentrism,    
         antielitism,
         indivisible)


df_irt_five_wave_2 <- na.omit(df_irt_five_wave_2)

# Since our variable is not an integer we round the values off to the nearest integer. 

df_irt_five_wave_2 <- df_irt_five_wave_2 %>%
  mutate(across(everything(), round)) %>%  
  mutate(across(everything(), ~ . - 1)) 

# We fit the model using the GRM for polytomous (ordered) items.

fit_5_wave_2 <- mirt(data = df_irt_five_wave_2, model = model_graded, itemtype = "graded")


```

```{r}
#| echo: false
#| fig-width: 14
#| fig-height: 8
#| label: item-information-wave-2
#| fig-cap: "Item Information Plots Wave 2"

# We apply the function from above for the Item Information for Wave 2. 

# We process full dataset for the Item Information plots.
# We use the fitted IRT model (fit_5_wave_2).

theta_values_wave_2 <- seq(-3, 3, by = 0.1)
item_names_wave_2 <- colnames(extract.mirt(fit_5_wave_2, 'data'))
info_results_wave_2 <- extract_info_data(fit_5_wave_2, theta_values_wave_2, item_names_wave_2)
info_results_wave_2$Item <- factor(info_results_wave_2$Item, levels = item_names_wave_2)

# We plot the information data.
info_results_wave_2 %>%
  mutate(Item = forcats::fct_recode(Item, !!!populist_items),  # Recode items for Wave 2 dataset
         Item = forcats::fct_relevel(Item, !!!populism_items_order)) %>%
  ggplot(aes(x = Theta, y = Information, color = Item)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ Item) +
  theme_bw() +
  labs(x = expression(theta),
       y = "Information") +
  theme(plot.title = element_text(hjust = 0.5), 
       legend.position = "bottom")

```

\newpage

```{r}
#| echo: false
#| fig-width: 14
#| fig-height: 8
#| fig-cap: "Item Probability Function (Wave 1)"
#| label: ipf-wave-1


# We apply the function from above for the Item Probability Function from wave 1. 

# We process the full dataset for the Item Probability Function. 
# We use the fitted IRT model (fit_5_wave_1).

theta_values_wave_1 <- seq(-3, 3, by = 0.1)
item_names_wave_1 <- colnames(extract.mirt(fit_5_wave_1, 'data'))
results_wave_1 <- extract_prob_data(fit_5_wave_1, theta_values_wave_1, item_names_wave_1)

# Convert item and category to factors for better labeling in the plot
results_wave_1$Item <- factor(results_wave_1$Item, levels = item_names_wave_1)
results_wave_1$Category <- factor(results_wave_1$Category, levels = unique(results_wave_1$Category))

# We plot using ggplot2 with the custom palette and linetypes for Wave 1.
results_wave_1 %>%
  mutate(Item = forcats::fct_recode(Item, !!!populist_items),  
         Item = forcats::fct_relevel(Item, !!!populism_items_order)) %>%
  ggplot(aes(x = Theta, y = Probability, color = Category, linetype = Category)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ Item, scales = "free_y") +
  theme_bw() +
  labs(x = expression(theta),
       y = expression(P(theta))) +
  scale_color_manual(values = category_colors) +
  scale_linetype_manual(values = lty_vector) +  
  theme(plot.title = element_text(hjust = 0.5), 
        legend.position = "bottom",
        legend.direction = "horizontal",
        legend.box = "horizontal") +  
  guides(color = guide_legend(title = "Category", nrow = 1), 
         linetype = guide_legend(title = "Category", nrow = 1))

```
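
The curves in the item probability plots can be related to the graded response model with a short calculation. The sketch below is illustrative only and does not reproduce the `extract_prob_data()` helper used in the chunk above. It assumes the slope-intercept parameterization, in which the cumulative probability is $P(X \ge k \mid \theta) = \text{logistic}(a\theta + d_k)$ and each category probability is the difference between adjacent cumulative probabilities. The function name `grm_category_probs()` and the parameter values are hypothetical and are not estimates from our models.

```r
# Category probabilities under a graded response model for a single item.
# `a` is the slope and `d` the decreasing intercepts; the values used below
# are made up for illustration and are not estimates from the fitted models.
grm_category_probs <- function(theta, a, d) {
  # Cumulative probabilities P(X >= k | theta) for k = 1, ..., K - 1.
  cum <- sapply(d, function(dk) plogis(a * theta + dk))
  cum <- matrix(cum, nrow = length(theta))
  # Pad with P(X >= 0) = 1 and P(X >= K) = 0, then take adjacent differences.
  padded <- cbind(1, cum, 0)
  probs <- padded[, -ncol(padded), drop = FALSE] - padded[, -1, drop = FALSE]
  colnames(probs) <- paste0("cat_", seq_len(ncol(probs)) - 1)
  probs
}

theta <- seq(-3, 3, by = 0.1)
probs <- grm_category_probs(theta, a = 1.8, d = c(2, 0.5, -1, -2.5))

# Each row holds the probabilities of all categories at one value of theta.
head(round(probs, 3))
```

Four intercepts imply five ordered categories, and the probabilities in each row sum to one. A steeper slope makes the category curves more sharply peaked, which is what higher item discrimination looks like in these plots.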

\newpage

```{r}
#| echo: false
#| fig-width: 14
#| fig-height: 8
#| fig-cap: "Item Probability Function (Wave 2)"
#| label: ipf-full-wave-2

# We apply the function from above for the Item Probability Function from wave 2. 

# We define a sequence of theta values for Wave 2.
theta_values_wave_2 <- seq(-3, 3, by = 0.1)

# We get the original item names from the Wave 2 model.
item_names_wave_2 <- colnames(extract.mirt(fit_5_wave_2, 'data'))

# We use the function to extract the probability data for Wave 2.
results_wave_2 <- extract_prob_data(fit_5_wave_2, theta_values_wave_2, item_names_wave_2)

# We convert item and category to factors for better labeling in the plot.
results_wave_2$Item <- factor(results_wave_2$Item, levels = item_names_wave_2)
results_wave_2$Category <- factor(results_wave_2$Category, levels = unique(results_wave_2$Category))

# We plot using ggplot2 with the custom palette and linetypes for Wave 2.
results_wave_2 %>%
  mutate(Item = forcats::fct_recode(Item, !!!populist_items),  
         Item = forcats::fct_relevel(Item, !!!populism_items_order)) %>%
  ggplot(aes(x = Theta, y = Probability, color = Category, linetype = Category)) +
  geom_line(linewidth = 1) +
  facet_wrap(~ Item, scales = "free_y") +
  theme_bw() +
  labs(x = expression(theta),
       y = expression(P(theta))) +
  scale_color_manual(values = category_colors) +
  scale_linetype_manual(values = lty_vector) +  
  theme(plot.title = element_text(hjust = 0.5), 
        legend.position = "bottom",
        legend.direction = "horizontal",
        legend.box = "horizontal") +  
  guides(color = guide_legend(title = "Category", nrow = 1), 
         linetype = guide_legend(title = "Category", nrow = 1))


```

\newpage

## Correlations 

\vspace{30pt}

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Correlation Matrix for Populist Items
#| label: tbl-correlations-populism

# We run a correlation matrix for the populism items using the modelsummary package.

populism_correlation <- df_integrated %>%
  select(Manichean = manichean, 
         Indivisible = indivisible, 
         `General Will` = generalwill, 
         `People Centrism` = peoplecentrism, 
         `Anti-Elitism` = antielitism) 

if (requireNamespace("correlation", quietly = TRUE)) {
  co <- correlation::correlation(populism_correlation)
  
  # We add significance stars to the easycorrelation object.
  datasummary_correlation(co, stars = TRUE,
                          output = 'kableExtra')
}

```

\newpage

```{r}
#| echo: false
#| output: asis
#| tbl-cap: Correlation Matrix for Attaching Ideologies
#| label: tbl-correlations-attaching-ideologies

# We run a correlation matrix for the attaching ideologies using the modelsummary package.


correlation_attaching_ideologies <- df_integrated %>%
  select(Populism = populism_cfa_resc,
         `Left-Right Economy` = lrecon,
         Nativism = nativism,
         `Lifestyle (rev)` = lifestyle_rev,
         `Law and Order` = laworder,
         Authoritarianism = authoritarian) 

if (requireNamespace("correlation", quietly = TRUE)) {
  co <- correlation::correlation(correlation_attaching_ideologies)
  
  # We add significance stars to the easycorrelation object.
  datasummary_correlation(co, stars = TRUE,
                          output = 'kableExtra') %>%
    kable_classic() %>% 
    kable_styling(font_size = 8)
}

```

\newpage

## Change between Waves per Party Family

\vspace{30pt}

```{r}
#| echo: false
#| fig-height: 8
#| fig-width: 12
#| tbl-cap: Difference per Party Family for Populism (Median, Mean, Standard Deviation)
#| label: tbl-difference-family-table-populism

# We create tables for populism for the median, mean and standard deviation.
# We do this since the plots can be difficult to read in terms of actual values. 

summary_table_populism <- df_parties_both_waves %>%
  dplyr::filter(family_label != "Confessional") %>%
  dplyr::filter(!is.na(populism_cfa_both_resc)) %>%  
  group_by(family_label, poppa_id, party_short) %>%
  arrange(poppa_id, wave) %>%
  reframe(diff_populism = diff(populism_cfa_both_resc)) %>%
  group_by(family_label) %>%
  dplyr::summarize(
    median_diff_populism = median(diff_populism, na.rm = TRUE),
    mean_diff_populism = mean(diff_populism, na.rm = TRUE),
    sd_diff_populism = sd(diff_populism, na.rm = TRUE)
  ) %>%
  ungroup() 


summary_table_populism %>%
  kable(
    format = "latex", 
    booktabs = TRUE, 
    linesep = "", 
    digits = 2, 
    col.names = c("Party Family", "Median Diff Populism", "Mean Diff Populism", "SD Diff Populism")
  ) %>%
  add_header_above(c(" " = 1, "Populism Difference Statistics" = 3)) %>%
  kable_classic() %>%
  kable_styling(latex_options = c("striped", "scale_down"), 
                full_width = FALSE, 
                position = "center", 
                font_size = 10) %>%
  column_spec(2:4, width = "5cm")

```
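
The logic behind the party-family table can be shown on a small example. The sketch below uses base R and hypothetical data (the data frame `toy` and its values are made up), and it assumes that the wave labels sort chronologically, so that `diff()` on sorted rows returns the Wave 2 score minus the Wave 1 score. Negative values therefore indicate a decline in populism between 2018 and 2023.

```r
# Hypothetical long-format data: one row per party and wave (not real scores).
toy <- data.frame(
  family   = c("A", "A", "A", "A", "B", "B"),
  party    = c("p1", "p1", "p2", "p2", "p3", "p3"),
  wave     = c("Wave 2 - 2023", "Wave 1 - 2018",
               "Wave 1 - 2018", "Wave 2 - 2023",
               "Wave 1 - 2018", "Wave 2 - 2023"),
  populism = c(6.0, 7.0, 3.0, 3.5, 8.0, 7.0)
)

# Sort by party and wave so that diff() returns Wave 2 minus Wave 1.
toy <- toy[order(toy$party, toy$wave), ]

# One difference per party.
party_diff <- aggregate(populism ~ family + party, data = toy, FUN = diff)
names(party_diff)[names(party_diff) == "populism"] <- "diff_populism"

# Summarise the party-level differences within each family.
family_mean   <- tapply(party_diff$diff_populism, party_diff$family, mean)
family_median <- tapply(party_diff$diff_populism, party_diff$family, median)

print(family_mean)
print(family_median)
```

Sorting before differencing matters: if the rows of a party were in reverse wave order, the sign of every difference would flip.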

\newpage

## OLS Regression Models with the Populism Mean

\vspace{20pt}

```{r}
#| echo: false
#| tbl-cap: Populist Party Characteristics (With Populism Mean)
#| label: tbl-pop-char-ols-mean

# We run the regression models for position with the populism mean variable. 

models_pop_char_mean <- list(
  "Model 1"  = lm(populism_mean ~ nativism + country + wave, data = df_integrated),
  "Model 2"  = lm(populism_mean ~ lrecon + country + wave, data = df_integrated),
  "Model 3"  = lm(populism_mean ~ authoritarian + country + wave, data = df_integrated),
  "Model 4"  = lm(populism_mean ~ nativism + lrecon + country + wave, data = df_integrated)
  
)


modelsummary::modelsummary(models_pop_char_mean, 
                           estimate = "{estimate}{stars}",
                           output = 'kableExtra',
                           coef_map = cm_char,
                           gof_map = c("nobs", "r.squared")) %>%
  kable_styling(font_size = 8) %>%
  kableExtra::add_footnote(
    label = c("With country fixed effects", 
              "Standard errors are shown in parentheses",
              "+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001"),
    notation = "none"
  )

```

\newpage

```{r}
#| echo: false
#| tbl-cap: Interaction with Wave (With Populism Mean)
#| label: tbl-interaction-wave-mean

# We run the regression models for the interactions with wave with the populism mean variable. 

models_interaction_wave_mean <- list(
  "Model 1 (FD)" = lm(populism_mean ~ wave + country, data = df_integrated),
  "Model 2 (FD)" = lm(populism_mean ~ lrecon_centered + wave*nativism_centered + country, data = df_integrated),
  "Model 3 (FD)" = lm(populism_mean ~ nativism_centered + wave*lrecon_centered + country, data = df_integrated),
  "Model 4 (FD)" = lm(populism_mean ~ lrecon_centered + wave*authoritarian_centered + country, data = df_integrated),
  "Model 5 (DSP)" = lm(populism_mean ~ wave + country, data = df_parties_both_waves),
  "Model 6 (DSP)" = lm(populism_mean ~ lrecon_centered + wave*nativism_centered + country, data = df_parties_both_waves),
  "Model 7 (DSP)" = lm(populism_mean ~ lrecon_centered + wave*authoritarian_centered + country, data = df_parties_both_waves)
)

modelsummary::modelsummary(models_interaction_wave_mean, 
                          estimate = "{estimate}{stars}",                           
                          output = 'kableExtra',
                          coef_map = cm_wave_centered,
                          gof_map = c("nobs", "r.squared")) %>%
  kable_styling(full_width = FALSE, font_size = 8) %>%
  column_spec(2:7, width = "5em") %>%  # Set the width of columns 2 to 7
  kableExtra::add_footnote(
    label = c("With country fixed effects; FD: Full dataset; DSP: Dataset with shared parties", 
              "Standard errors are shown in parentheses",
              "+ p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001"),
    notation = "none"
  )

```

## Number of Experts Per Country

```{r}
#| echo: false
#| message: false
#| tbl-cap: Number of Experts Per Country
#| label: tbl-experts-per-country

# Country Level

# We import the expert-level data. This dataset includes party IDs and the number of experts per party, per item.
# The data is structured at the expert level (rows) with multiple rows per party.
# The number of experts can vary per party and per item.
# We filter out rows in which all values are missing.

expert_data_2018 <- haven::read_dta(here("data/all_expert.dta")) %>%
  select(poppa_id = party_id,
         ends_with("_nr")) 

expert_data_2018$poppa_id <- as.numeric(expert_data_2018$poppa_id)

# We read in the Excel file with the up-to-date information (e.g., party names). 

poppa_excel_2018 <- readxl::read_excel(here("data/POPPA_List_of_parties_2018.xlsx"), sheet = 2) %>%
  janitor::clean_names() %>%
  select(country,
         poppa_id = party_id,
         party_short = abbreviation,
         party_name_english)


# We join the Excel file with the expert-level data to update party information (e.g., names).
# We set any values with fewer than 4 experts to NA to exclude them from analysis.

expert_data_2018_joined <- full_join(expert_data_2018, poppa_excel_2018, by = join_by(poppa_id)) %>%
  select(country,
         poppa_id,
         party_short,
         party_name_english,
         everything()) %>%
  mutate(across(ends_with("_nr"), ~ ifelse(. < 4, NA, .))) 


# We make country a factor in preparation for joining with the 2023 dataset. 

expert_data_2018_joined$country <- forcats::as_factor(expert_data_2018_joined$country)


# We take the first value for each party per item. 
# The values per party per item are the same.
# In essence we are collapsing the data.

expert_data_2018_joined_collapsed <- expert_data_2018_joined %>%
  group_by(country, poppa_id, party_short) %>%
  summarise(across(ends_with("_nr"), first))

# We calculate the mean, min, and max number of experts.
# Since we are using rowwise calculations, these values are calculated per party, across all items.
# Afterward, we summarize the mean, min, and max at the country level by averaging the party-level results.

summary_table_2018_country <- expert_data_2018_joined_collapsed %>%
  rowwise() %>%  
  dplyr::filter(!all(is.na(c_across(ends_with("_nr"))))) %>%  
  mutate(
    mean_experts = round(mean(c_across(ends_with("_nr")), na.rm = TRUE), 2),   
    min_experts = min(c_across(ends_with("_nr")), na.rm = TRUE),              
    max_experts = max(c_across(ends_with("_nr")), na.rm = TRUE)               
  ) %>%
  ungroup() %>%  # Remove rowwise() context to group by country
  group_by(country) %>%
  summarise(
    mean_experts = round(mean(mean_experts, na.rm = TRUE), 2),  
    min_experts = min(min_experts, na.rm = TRUE),               
    max_experts = max(max_experts, na.rm = TRUE)                
  ) %>%
  mutate(wave = "Wave 1 - 2018")  # Add the wave column


# We import the expert-level data for 2023. This dataset includes party IDs and the number of experts per party, per item.
# The data is structured at the expert level (rows) with multiple rows per party.
# The number of experts can vary per party and per item.

expert_data_2023 <- readRDS(here("data/poppa2_expert.rds")) %>%
  janitor::clean_names() %>%
  select(country,
         poppa_id,
         ends_with("_n_experts")) %>% 
  # We drop rows in which every expert count is missing (evaluated row by row).
  dplyr::filter(!if_all(ends_with("_n_experts"), is.na)) 


# We read in the Excel file with the up-to-date information (e.g., party names). 

poppa_excel_2022 <- readxl::read_excel(here("data/POPPA_List_of_parties_2022.xlsx"))  %>%
  janitor::clean_names() %>%
  select(poppa_id = party_id,
         party_short = abbreviation,
         party_name_english)


# We join the Excel file with the expert-level data to update party information (e.g., names).
# We set any values with fewer than 4 experts to NA to exclude them from analysis.

expert_data_2023_joined <- left_join(expert_data_2023, poppa_excel_2022, by = join_by(poppa_id)) %>%
  select(country,
         poppa_id,
         party_short,
         party_name_english,
         everything())  %>%
  mutate(across(ends_with("_n_experts"), ~ ifelse(. < 4, NA, .)))  # Replace values with NA if experts < 4


# We fix the following country names to match the data from 2018. 

expert_data_2023_joined <- expert_data_2023_joined %>%
  mutate(country = forcats::as_factor(country),
         country = forcats::fct_recode(country,
                                       "Czech Republic" = "Czech",
                                       "Belgium - Flanders" = "Flanders",
                                       "Belgium - Wallonia" = "Wallonie"))



# We take the first value for each party per item. 
# The values per party per item are the same.
# In essence we are collapsing the data.

expert_data_2023_joined_collapsed <- expert_data_2023_joined %>%
  group_by(country, poppa_id, party_short) %>%
  summarise(across(ends_with("_n_experts"), first))

# We calculate the mean, min, and max number of experts.
# Since we are using rowwise calculations, these values are calculated per party, across all items.
# Afterward, we summarize the mean, min, and max at the country level by averaging the party-level results.

summary_table_2023_country <- expert_data_2023_joined_collapsed %>%
  rowwise() %>%  
  dplyr::filter(!all(is.na(c_across(ends_with("_n_experts"))))) %>%  
  mutate(
    mean_experts = round(mean(c_across(ends_with("_n_experts")), na.rm = TRUE), 2),  
    min_experts = min(c_across(ends_with("_n_experts")), na.rm = TRUE),              
    max_experts = max(c_across(ends_with("_n_experts")), na.rm = TRUE)               
  ) %>%
  ungroup() %>%  
  group_by(country) %>%
  summarise(
    mean_experts = round(mean(mean_experts, na.rm = TRUE), 2),  
    min_experts = min(min_experts, na.rm = TRUE),               
    max_experts = max(max_experts, na.rm = TRUE)                
  ) %>%
  mutate(wave = "Wave 2 - 2023")  


# We combine the 2018 and 2023 datasets for a country-level summary of expert evaluations across both waves.

combined_data_country <- dplyr::bind_rows(summary_table_2018_country, summary_table_2023_country)

# We calculate the overall mean, min, and max.

overall_row_country <- combined_data_country %>%
  group_by(wave) %>%
  dplyr::summarize(
    mean_experts = round(mean(mean_experts, na.rm = TRUE), 2),
    min_experts = min(min_experts, na.rm = TRUE),
    max_experts = max(max_experts, na.rm = TRUE),
  ) %>%
  mutate(country = "Overall")

# We relocate the 'country' column to ensure the correct order.

overall_row_country <- overall_row_country %>%
  dplyr::relocate(country, .before = mean_experts)

# We bind the new row to the dataset.

combined_data_country <- bind_rows(combined_data_country, overall_row_country)

# We pivot the dataset wider based on the 'wave' column.

combined_data_wide_country <- combined_data_country %>%
  pivot_wider(names_from = wave,
              values_from = c(mean_experts, min_experts, max_experts),
              names_glue = "{.value}_{wave}"
  ) %>% 
  select(
    Country = country,
    `Mean Experts Wave 1` = `mean_experts_Wave 1 - 2018`,
    `Min Experts Wave 1` = `min_experts_Wave 1 - 2018`,
    `Max Experts Wave 1` = `max_experts_Wave 1 - 2018`,
    `Mean Experts Wave 2` =  `mean_experts_Wave 2 - 2023`,
    `Min Experts Wave 2` = `min_experts_Wave 2 - 2023`,
    `Max Experts Wave 2` = `max_experts_Wave 2 - 2023`
  ) %>%
  mutate(Country = as.character(Country)) %>%  
  mutate(flag = if_else(Country == "Overall", 1, 0)) %>%  
  arrange(flag, Country) %>%                            
  select(-flag)    


# We print the table.

combined_data_wide_country %>%
  kable(
    format = "latex",  
    booktabs = TRUE,
    linesep = "") %>% 
  kable_classic() %>% 
  kable_styling(
    full_width = FALSE, 
    position = "center", 
    bootstrap_options = c("striped", "hover")
  ) %>%
  kable_styling(font_size = 8) %>%
  column_spec(2:7, width = "5em") %>% 
  row_spec(nrow(combined_data_wide_country), bold = TRUE) 

```
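
The two-step aggregation used for the expert table can be sketched on toy data. The example below is a base R illustration with hypothetical counts (the data frame `toy_counts` and its column names are made up). It assumes the same rules as the chunk above: counts below four are set to missing, the mean, minimum, and maximum are first computed per party across items, and the country-level mean is the average of the party-level means.

```r
# Hypothetical expert counts per party and item (not real data).
toy_counts <- data.frame(
  country  = c("X", "X", "Y"),
  party    = c("p1", "p2", "p3"),
  item1_nr = c(10, 3, 6),
  item2_nr = c(12, 5, 2)
)

item_cols <- grep("_nr$", names(toy_counts), value = TRUE)

# Counts below 4 are set to NA, as in the chunk above.
counts <- as.matrix(toy_counts[item_cols])
counts[counts < 4] <- NA

# Party-level mean, minimum, and maximum across items.
toy_counts$mean_experts <- rowMeans(counts, na.rm = TRUE)
toy_counts$min_experts  <- apply(counts, 1, min, na.rm = TRUE)
toy_counts$max_experts  <- apply(counts, 1, max, na.rm = TRUE)

# Country-level summary: average of the party means, overall min and max.
mean_by_country <- tapply(toy_counts$mean_experts, toy_counts$country, mean)
min_by_country  <- tapply(toy_counts$min_experts,  toy_counts$country, min)
max_by_country  <- tapply(toy_counts$max_experts,  toy_counts$country, max)

country_summary <- data.frame(
  country      = names(mean_by_country),
  mean_experts = as.vector(mean_by_country),
  min_experts  = as.vector(min_by_country),
  max_experts  = as.vector(max_by_country)
)

print(country_summary)
```

Averaging party-level means gives every party equal weight within a country, regardless of how many experts rated it.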




